Wednesday, December 15, 2010

In-Memory Downloading from SQL Server (2)

This is Part 2 of an article written in March 2010 with the title "Downloading Data into a SQL Anywhere In-Memory Database from Microsoft SQL Server".

Part 1 is here.

Technique 2: Download Via MobiLink

MobiLink is bidirectional synchronization software that ships in the box with SQL Anywhere. MobiLink works with one or more SQL Anywhere "remote" databases and one central "consolidated" database running on SQL Server, Oracle, IBM DB2, Sybase ASE, SQL Anywhere or MySQL.

MobiLink consists of client and server components which connect to the remote and consolidated databases respectively. For the technique described in this section, MobiLink is used in one direction only (download) from a SQL Server consolidated database to a SQL Anywhere remote database. The process works as follows:
a) When the MobiLink server is started, often as a service running on or close to the computer running SQL Server, it establishes an ODBC connection to the SQL Server consolidated database and then waits to be contacted by the MobiLink client.

b) When the MobiLink client is launched to start a synchronization session, it establishes a database connection to the SQL Anywhere remote database, and then it establishes a network connection to the MobiLink server using a proprietary high-level protocol.

c) The MobiLink server then executes a user-written SQL SELECT statement against the SQL Server database and sends the result set to the MobiLink client via the proprietary network protocol.

d) The MobiLink client applies the downloaded result set, row by row, to the SQL Anywhere database using SQL INSERT statements.
Folks familiar with MobiLink will recognize that the above steps do not begin to exploit the features and flexibility built in to MobiLink. When it comes to the subject of downloading data to an in-memory database, the following features may be valuable (but are not shown in this article):
a) The WHERE clause of the download SELECT can include a predicate that selects rows which have been inserted or updated since the previous synchronization, if the rows contain a datetime column that record when the row was inserted or last updated. MobiLink automatically maintains the datetime of the last successful synchronization for this purpose. This technique may be more useful when the data in the in-memory database is saved persistently; i.e., when the -im c option is used instead of -im nw (in-memory with checkpoints instead of no writes).

b) Other predicates may be used to select a subset of rows; e.g., only download this year's data.

c) The download SELECT statements can be a complex as you want, even using stored procedure calls, as long as the result set matches the row layout defined on the remote database. This fact makes it easy to accommodate schema differences between consolidated and remote databases: normalized versus denormalized designs, and so on.
Figure 10 shows how the MobiLink system tables and other MobiLink schema objects are added to the SQL Server database before any synchronizations can be run. MobiLink doesn't require any user-written configuration files; everything necessary to run the server side of a synchronization is stored in the SQL Server database itself.

Figure 10: MobiLink System Setup on SQL Server
Context: SQL Server
"C:\Program Files\Microsoft SQL Server\100\Tools\Binn\sqlcmd.exe"^
  -i "%SQLANY11%\MobiLink\setup\syncmss.sql"^
  -d main^
  -I^
  -P j68Fje9#fyu489^
  -S BRECK-PC\TSUNAMI^
  -U sa
Line 2 in Figure 10 specifies the input SQL command file which is installed as part of the standard SQL Anywhere setup. Included in this file is the CREATE statement for the ml_script table used by MobiLink to store the download SELECT statement mentioned earlier. When the MobiLink server starts a synchronization session, it reads the SELECT statement from ml_script and then turns around and executes that statement against SQL Server. There's lots of other objects defined in syncmss.sql but ml_script is at the heart of things for the technique shown here.

Figure 11 shows how the download SELECT is added to the MobiLink system table on SQL Server.

Figure 11: MobiLink Download Script on SQL Server
Context: SQL Server
USE main
GO

EXECUTE ml_add_table_script 
   'v1',  
   'sa_target',  
   'download_cursor', 
   'SELECT * FROM mss_source' 
GO
Line 4 in Figure 11 launches one of the MobiLink system procedures created by the command shown in Figure 10. In this case, the ml_add_table_script procedure adds a new table-specific MobiLink script to the ml_script table.

Lines 5, 6 and 7 provide the three-part primary key that uniquely identifies the SELECT statement on line 8.

The script version 'v1' on line 5 is used to make it easy to define whole different sets of MobiLink scripts to be used by different versions of your application. In this article, we're only using one version of one single script, so the version isn't important.

Line 6 specifies the remote database table name 'sa_target' to which this script applies: the SQL Server result set produced by the SELECT * FROM mss_source on line 8 is to be sent to sa_target on the SQL Anywhere database.

Line 7 names which "MobiLink event" will launch the script on line 8. The MobiLink server is event driven, with many dozens of connection, synchronization and table-level event. In a complex synchronization application, many hundreds of synchronization scripts are required, for various purposes like authentication, conflict resolution and upload processing as well as download SELECT statements.

Note: The actual schema for the MobiLink system tables is a bit more complex than implied here, so if you go looking at the table layouts you'll find that ml_script has a single-column surrogate primary key, that the relationship table ml_table_script contains the three-part primary key described above, and that integers are used to identify the version and table, via yet more relationship tables... none of which is really important for the discussion here.

In this article exactly one single MobiLink event is used, the table-level 'download_cursor' script. In a perfect world this event would be named 'download_select' because that's what you code: a SELECT, or a procedure CALL which returns a result set, never a cursor definition.

When the MobiLink server reaches the point in the synchronization process when it's time to download data to the sa_target table, it fires the download_cursor event for sa_target and the script on line 8 in Figure 11 is executed.

Figure 12 shows the MobiLink server running on the server computer BRECK-PC, and Figure 13 shows the command that was used to start it. At this point, the MobiLink server is waiting for a synchronization request from the client side.

Figure 12: MobiLink Server Window
Context: SQL Server


Figure 13: Start MobiLink Server
Context: SQL Server
"%SQLANY11%\bin32\mlsrv11.exe"^
  -c "DSN=main_BRECK-PC;UID=sa;PWD=j68Fje9#fyu489"^
  -o mlsrv11_log.txt^
  -vscn^
  -zu+ 
Line 1 in Figure 13 is the filespec for the MobiLink server mlsrv11, and line 2 provides the ODBC database connection string that the MobiLink server will use to connect to the SQL Server database: the ODBC DSN and the SQL Server user id and password.

Line 3 specifies the text file where the MobiLink server should write diagnostic and progress messages, and the -vscn option on line 4 sets the verbosity level: script names, script contents and row counts. Experience has shown that this diagnostic file is critical to the debugging process during both development and production... because even in production, whenever a network is involved, stuff happens.

The -zu+ option on line 5 tells the MobiLink server not to bother authenticating the MobiLink user name sent to it by the MobiLink client. If you need extra security between the MobiLink client and server, you can predefine the MobiLink user name and password and omit this option to force authentication. You can also specify transport layer and end-to-end encryption for the path between MobiLink client and server, but the details of encryption are not discussed in this article.

Figure 14 shows the user DSN that is stored in the registry on the server computer.

Figure 14: ODBC User DSN for SQL Server Database
Context: SQL Server
Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\ODBC\ODBC.INI\main_BRECK-PC]
"Driver"="C:\\Windows\\system32\\sqlncli10.dll"
"Server"="BRECK-PC\\TSUNAMI"
"Database"="main"
"LastUser"="sa"

[HKEY_CURRENT_USER\Software\ODBC\ODBC.INI\ODBC Data Sources]
"main_BRECK-PC"="SQL Server Native Client 10.0"
Figure 15 shows the MobiLink setup necessary for the SQL Anywhere remote database on the client side. MobiLink uses a simplified "publish subscribe" model where the tables to be synchronized are named as in a "publication" and a MobiLink "user" is linked to that publication via a "subscription".

Figure 15: MobiLink Setup on SQL Anywhere
Context: SQL Anywhere
CREATE PUBLICATION p1 FOR DOWNLOAD ONLY 
   ( TABLE sa_target );

CREATE SYNCHRONIZATION USER "1" TYPE tcpip;

CREATE SYNCHRONIZATION SUBSCRIPTION TO p1 FOR "1";
Lines 1 and 2 in Figure 15 define a publication called p1, consisting of the single table sa_target. In more complex setups, multiple table names can appear, with column name lists to specify subsets of columns and WHERE clauses to control which rows are uploaded.

Normally, tables are subject to upload as well as download but that's changed by the FOR DOWNLOAD ONLY clause. This is done for two reasons: The technique being described here is download only, and the usual MobiLink transaction-log-driven upload process is impossible because transaction logs don't exist for in-memory databases. MobiLink does offer a non-log-based script-driven form of upload, but again, upload isn't needed here.

Line 4 names the MobiLink user id as "1" and specifies TCP/IP as the network transport layer between the MobiLink client and server. In simple setups like this the MobiLink user id uniquely identifies the remote SQL Anywhere database. Other clients would use 2, 3, etcetera, or any other unique VARCHAR ( 128 ) string. This user id is all that's necessary, and it is how the MobiLink server keeps different synchronization sessions apart.

Note: The MobiLink user id is effectively a globally unique remote database identifier that you manage. It is not a database user id.

Line 6 creates the "subscription": user id 1 subscribes to publication p1.

Figure 16 shows the MobiLink client running on the laptop computer PAVILION2, and Figure 17 shows the command that was used to start it. At this point, the MobiLink server has responded and a synchronization is in progress; in fact, all the data has been sent to the MobiLink client and 1.3 million rows have been inserted into the SQL Anywhere database so far.

Figure 16: MobiLink Client Window
Context: SQL Anywhere


Figure 17: Run Synchronization Session via the MobiLink Client
Context: SQL Anywhere
"%SQLANY11%\bin32\dbmlsync.exe"^
  -c "ENG=mem;DBN=mem;UID=dba;PWD=sql"^
  -e "adr='host=BRECK-PC';sv=v1"^
  -o dbmlsync_log.txt^
  -vno 
Line 1 in Figure 17 is the full filespec for the dbmlsync program, and line 2 shows the standard SQL Anywhere database connection string: server or engine name, database name, and SQL Anywhere user id and password.

Line 3 provides extra synchronization parameters, in this case the host address of the MobiLink server which is running on the BRECK-PC computer together with SQL Server, plus the script version sv=v1 to tell the MobiLink server which set of MobiLink scripts to use.

Line 4 specifies the text file where the MobiLink client should write diagnostic and progress messages, and line 5 specifies the -vno sets the verbosity of these messages to include row counts and options used. The client-side diagnostic file is not as useful as the server-side file defined earlier on line 3 in Figure 13... but it sometimes helps.

-- [to be continued] --

No comments: