Unicode

(This entry is part of a series. The audience: SQL Server developers. The topic: SQLite on mobile devices.)


Well, Actually

First, go read this blog entry by Miguel de Icaza. Right now. I'll wait.

Welcome back. Now let me apologize. I don't want to be a pedantic jerk who quibbles about minor details. But the topic here is Unicode, so there really is no other way.

All Unicode, all the time

The relevant difference with SQLite is easy to describe:

  • In the world of Windows and SQL Server, you have all kinds of possible code pages.

  • In SQLite, everything is Unicode.

But if you don't have much context on these issues, I haven't really told you much. Let's go further.

But let's not go too far

I don't want to rewrite articles that have already been written quite well. So you should probably also go read this blog entry by Joel Spolsky. I'll be here when you get back.

OK, now let's get started

SQL Server has two basic ways you store text:

  • You can use the char/varchar types, which can be used with one of several collations, each of which implies a specific code page, which implies a specific character encoding.

  • Or you can use the nchar/nvarchar types (note the extra 'n'), which are Unicode.

SQLite has no such distinction. All text in SQLite is Unicode.

What the heck is Unicode again?

It's a character set: a collection of characters, each with a number that can be used to refer to it.

More specifically, it's the only character set which is [trying to be] complete. If you choose any character set or encoding which is not Unicode, there will be characters you cannot use.

And what's an encoding?

Saying that SQLite uses Unicode doesn't tell you how the text is actually represented. Unicode is not an encoding. It is more abstract than that. There are lots of different ways of representing Unicode as bytes.

Microsoft's sad history with Unicode

In the Windows and SQL Server world, there is a long history of encoding Unicode in 16-bit-ish ways. Originally, this was UCS-2, a simple encoding which represents each character as a 16-bit number. But then the Unicode consortium realized that 16 bits are not enough, so they expanded the space to 32 bits. This left Microsoft in an awkward spot. UCS-2 is a fundamentally defective encoding of Unicode, since there are many characters in Unicode that simply cannot be represented.

If the goal of Unicode is to be complete, it is reasonable to say that, well actually, UCS-2 is not Unicode.

The conceptual replacement for UCS-2 is to use 32 bits for every character. This encoding is called UCS-4 or UTF-32. But now the wasted space for storing a simple English string is getting out of hand. Switching the complete works of Arthur Conan Doyle from ASCII (which is also an encoding) to UCS-4 would take four times as much space.

Gradually, the world seems to be adopting UTF-8 as the most popular Unicode encoding. This is a variable width encoding. Sometimes a single character is represented with just one byte. Sometimes it needs more. That's very unfortunate, but the only fixed width alternative is UCS-4, which is also very unfortunate. Choose which problem you prefer, but keep in mind that almost everybody has chosen to accept the problems of UTF-8.

But Microsoft has so much history with UCS-2 that transitioning everything to UTF-8 would be really hard. So they have been moving from UCS-2 to UTF-16, which is basically a variable width encoding built around a 16-bit unit instead of an 8-bit unit. UTF-16 is approximately the worst correct way of representing Unicode, unless you have invested billions of dollars in the fundamentally broken UCS-2, in which case UTF-16 is a pretty awesome way out of the mess you ended up in.

Just remember that if you're going out tonight to a club for pedantic nerds and you want to impress someone, you've got to keep the terminology straight:

  • Unicode is an abstraction, not an encoding, not a code page, not a data format, and not a font.

  • Saying your text is Unicode says nothing about how it is represented. It might be UTF-8. It might UTF-16. It might be code point numbers handwritten on Post-It notes stuck on the wall. All of these are valid representations of Unicode.

  • If you ever say anything to suggest that you think Unicode is 16 bits per character, you will be identified as clueless.

  • If you say that your text is stored in Unicode, you are not entirely incorrect, but people will wonder about whether you really know the difference between Unicode and the encodings of same.

SQLite

SQLite always uses Unicode to represent text.

(Hopefully you are now screaming at me saying, "Yeah, but which encoding?!?")

The best answer to this question is: SQLite uses UTF-8. Forget about everything else.

A more correct answer to this question is: SQLite uses UTF-8 but also supports UTF-16.

Either way, there is no distinction between char and nchar. There is no way to save storage space in a column by realizing that you only use lower case english characters so it's safe to use char instead of nchar. There are no code pages. There is no way to move your Shift JIS data into SQLite without converting it to Unicode (or storing it as blobs, I suppose).

Summary

Microsoft has done a lot of things right, but its history with Unicode is very unfortunate. And it's not entirely their fault. They adopted Unicode early and it changed underneath them.

With respect to its emphasis on UTF-8, SQLite is far more typical of most non-Microsoft software today.

 

 

Data Types

(This entry is part of a series. The audience: SQL Server developers. The topic: SQLite on mobile devices.)


Different types of, er, types

At the SQL language level, the biggest difference with SQLite is the way it deals with data types. There are three main differences to be aware of:

  1. There are only a few types

  2. And types are dynamic

  3. (But not entirely, because they have affinity)

  4. And type declarations are weird

Okay, so actually that's FOUR things, not three. But the third one doesn't really count, so I'm not feeling terribly obligated to cursor all the way back up to the top just to fix the word "three". Let's keep moving.

Only a few types

SQLite values can be one of the following types:

  • INTEGER

  • REAL

  • TEXT

  • BLOB

The following table shows roughly how these compare to SQL Server types:

SQL Server SQLite Notes
tinyint
smallint
int
bigint
bit
INTEGER In SQLite, all integers are up to 64 bits wide (like bigint), but smaller values are stored more efficiently.
real
float
REAL In SQLite, all floating point numbers are 64 bits wide.
char
varchar
nchar
nvarchar
text
ntext
TEXT In SQLite, all strings are Unicode, and it doesn't care about widths on TEXT columns.
binary
varbinary
image
BLOB Width doesn't matter here either.
decimal
numeric
money
smallmoney
INTEGER ? These are problematic, as SQLite 3 does not have a fixed point type. (In Zumero, we handle synchronization of these by mapping them to INTEGER and handling the scaling.)
date
datetime
datetime2
datetimeoffset
smalldatetime
time
(your choice) SQLite has no data types for dates or times. However, it does have a rich set of built-in functions for manipulating date/time values represented as text (ISO-8601 format), integer (unix time) or real (Julian day).

Types are dynamic

In SQL Server, the columns in a table are strictly typed. If you define a column to be of type smallint, then every value in that column must be a 16 bit signed integer.

In contrast, SQLite's approach might be called "dynamic typing". Quoting from its own documentation: "In SQLite, the datatype of a value is associated with the value itself, not with its container."

For example, the following code will fail on SQL Server:

CREATE TABLE [foo] (a smallint);
INSERT INTO [foo] (a) VALUES (3);
INSERT INTO [foo] (a) VALUES (3.14);
INSERT INTO [foo] (a) VALUES ('pi');

But on SQLite, it will succeed. The value in the first row is an INTEGER. The value in the second row is a REAL. The value in the third row is a TEXT string.

sqlite> SELECT a, typeof(a) FROM foo;
3|integer
3.14|real
pi|text

The column [a] is a container that simply doesn't care what you place in it.

Type affinity

Well, actually, it does care. A little.

A SQLite column does not have a type requirement, but it does have a type preference, called an affinity. I'm not going to reiterate the type affinity rules from the SQLite website here. Suffice it to say that sometimes SQLite will change the type of a value to fit match the affinity of the column, but you probably don't need to know this, because:

  • If you declare of column of type TEXT and always insert TEXT into it, nothing weird will happen.

  • If you declare of column of type INTEGER and always insert INTEGER into it, nothing weird will happen.

  • If you declare of column of type REAL and always insert REAL into it, nothing weird will happen.

In other words, just store values of the type that matches the column. This is the way you usually do things anyway.

Type declarations are weird

In a column declaration, SQLite has a rather funky set of rules for how it parses the type. It uses these rules to try its very best to Do The Right Thing when somebody ports SQL code from another database.

For example, all of the columns in the following table end up with TEXT affinity, which is probably what was intended:

CREATE TABLE [foo] 
(
[a] varchar(50),
[b] char(5),
[c] nchar,
[d] nvarchar(5),
[e] nvarchar(max),
[f] text
);

But in some cases, the rules are funky. Here are more declarations which all end up with TEXT affinity, even though none of them look right:

CREATE TABLE [foo] 
(
[a] characters,
[b] textish,
[c] charbroiled,
[d] context
);

And if you want to be absurd, SQLite will let you. Here's an example of a declaration of a column with INTEGER affinity:

CREATE TABLE [foo] 
(
[d] My wife and I went to Copenhagen a couple weeks ago
    to celebrate our wedding anniversary 
    and I also attended SQL Saturday while I there
    and by the way we saw
    Captain America The Winter Soldier 
    there as well which means I got to see it 
    before all my friends back here in Illinois 
    and the main reason this blog entry is late is 
    because I spent most of the following week gloating
);

SQLite will accept nearly anything as a type name. Column [d] ends up being an INTEGER because its ridiculously long type name contains the characters "INT" (in "Winter Soldier").

Perhaps we can agree that this "feature" could be easily abused.

There are only four types anyway. Pick a name for each type and stick to it. Once again, the official names are:

  • INTEGER

  • REAL

  • TEXT

  • BLOB

(If you want a little more latitude, you can use INT for INTEGER. Or VARCHAR for TEXT. But don't stray very far, mkay?)

Pretend like these are the only four things that SQLite will allow, and then it will never surprise you.

Summary

SQLite handles types very differently from SQL Server, but its approach is mostly a superset of your existing habits. The differences explained above might look like a big deal, but in practice, they probably won't affect you all that much.

 

 

Foreign Keys

(This entry is part of a series. The audience: SQL Server developers. The topic: SQLite on mobile devices.)


I can't use SQLite! It doesn't support foreign keys!

Actually, it does. Since version 3.6.19, released in October 2009. Before the first iPad. Windows 7 was not quite out yet. Miley Cyrus was still Hannah Montana.

You're wrong! This code should fail but doesn't!

CREATE TABLE [to] (pk int PRIMARY KEY);
CREATE TABLE [from] (other REFERENCES [to](pk) );
INSERT INTO [from] (other) VALUES (3);

Fair enough. It is easy to get the impression that SQLite still lacks support for foreign keys because the feature is [usually] turned off by default, to preserve backward compatibility.

How to turn SQLite foreign key support on

To enable the enforcement of foreign key constraints, execute the following SQL statement, shortly after you open your SQLite database connection:

PRAGMA foreign_keys = 1;

I always use SQLite with foreign keys. In fact, I always open SQLite connections with a helper function which (1) opens the SQLite connection and (2) immediately executes the PRAGMA above.

Differences

Once you've got the feature turned on, foreign key support in SQLite is generally similar to what a SQL Server dev would expect. Full documentation is available on the SQLite website, but there are a few things worth highlighting.

  • SQLite supports simple foreign keys (with one column referencing one column), or composite foreign keys (with multiple columns referencing multiple columns).

  • SQLite does support ON DELETE and ON CASCADE.

  • SQLite allows a foreign key constraint to be deferred, which means that its enforcement is postponed until the transaction is committed. This is a handy feature that I really wish SQL Server had.

Shameless, blatantly self-serving commercial remark

Zumero for SQL Server automatically syncs foreign key constraints. Microsoft Sync Framework does not.

So there.

:-)

Summary

PRAGMA foreign_keys = 1; Always.

 

 

UNIQUE NULL

(This entry is part of a series. The audience: SQL Server developers. The topic: SQLite on mobile devices.)


Nullable columns in UNIQUE constraints

The following code will cause a unique constraint violation in SQL Server:

CREATE TABLE foo (pk int PRIMARY KEY IDENTITY, x int UNIQUE NULL);
INSERT INTO foo (x) VALUES (1);
INSERT INTO foo (x) VALUES (2);
INSERT INTO foo (x) VALUES (NULL);
INSERT INTO foo (x) VALUES (NULL);

SQL Server doesn't allow the second NULL value to be inserted for column [x].

But SQLite allows this. See Issues with sync and constraints for more discussion of why this kind of thing matters.

Boxes and Cats

The difference arises from the question of whether NULL is considered a value or the absence of a value.

Suppose you have a set of cardboard boxes. Each box can hold one cat. And the cats must be unique.

SQLite thinks of NULL as an empty box. Having multiple empty boxes does not violate the rule about cats being unique.

SQL Server thinks of NULL as the name of a specific cat. A box cannot be empty. And the NULL cat can only be in one box.

If you prefer an explanation with more technical detail and fewer cat pictures, see this page on the SQLite website, which (at the time of this writing) provides a chart of how 12 different SQL implementations handle 8 different situations with NULL values. An additional remark about the amibiguous language in the SQL92 standard appears in the SQLite FAQ.

Solution 1: Don't have the problem

One possible solution is to never have this problem in the first place. Obviously, this issue won't arise if you don't have any nullable columns in unique constraints.

Solution 2: Don't cause the problem

Another option is to just not have any bugs in your app. :-)

Hmmm. That was a bit too flippant to be helpful. Let me try to explain it a different way:

In many (or most?) situations, SQL constraints are used as a way to prevent bugs in upper layers from screwing up the data. In other words, if a constraint gets violated, that's a sign that something went wrong. A constraint violation is typically not used as control flow of the software under normal usage. So, if a constraint violation happens, it is usually a sign of a bug in an upper layer.

SQL constraints also provide guarantees which allow the code in upper layers to be simpler. Those guarantees are part of the contractual agreement between your application code and your database. If the database is promising that values in column [foo] will never be greater than 5, the application code can trust in that promise. It doesn't need to do range checking on every value it gets from the database.

If SQL constraints did not exist, we would simply have to enforce our data integrity rules in application code instead of at the database level. This is not "The SQL Way", but it is certainly possible. The NoSQL folks have to live this way all the time.

So, you have the option of simply accepting the fact that SQLite will not enforce SQL Server's rule about distinct NULLs. If your app never inserts a NULL into that column, you don't need to worry about it. Or you can make your application code do extra checking to make sure.

Solution 3: Use a trigger to make SQLite act like SQL Server

If you want SQLite to enforce distinct NULLs at the database level in the same way as SQL Server does, you can get that behavior by adding triggers to your SQLite tables. For example:

CREATE TRIGGER "uniqnull" BEFORE INSERT ON "foo" WHEN ( (NEW."x" IS NULL) )
BEGIN

SELECT RAISE(ABORT, 'constraint violation: unique null')
WHERE EXISTS (SELECT 1 FROM "foo" c WHERE ( (c."x" IS NEW."x") ));

END

The syntax of this trigger probably looks strange to you. I'll be writing about the differences between SQL Server triggers and SQLite triggers in another entry in this series.

This trigger will impact performance. You can decide if the tradeoff is worth it for your application. Measure it.

(When Zumero for SQL Server encounters a nullable column in a unique constraint, it automatically adds triggers like this one to the corresponding SQLite table on the client side. Those triggers include a comment about possible performance impact and suggesting that the trigger can simply be dropped if the other tradeoff is preferable.)

Summary

SQLite handles nullable columns in unique constraints differently from SQL Server. This distinction may never cause you problems, but if it does, you have several options for dealing with it.

 

 

Issues with sync and constraints

(This entry is part of a series. The audience: SQL Server developers. The topic: SQLite on mobile devices.)


Fail Fast

Think of a bug as having two parts:

  1. The incorrect code

  2. The visible symptom

The worst bugs are the ones where these two parts are separated.

For example, consider the following function in C:

void crash_now(void)
{
    char* p = NULL;
    *p = 5;
}

This crash will be easy to find and fix, because the incorrect code is very close to the point where the crash is going to occur.

In contrast, the following code is likely going to waste more time:

int count_decimal_digits(int n)
{
    char* p = malloc(64);
    sprintf(p, "%d", n);
    return strlen(p);
}

One of the several bugs in this function is a memory leak. Whatever symptom arises from this leak will almost certain occur much later, making it much more difficult to realize that the incorrect code is right here in this function.

In 1992 I was working at Spyglass (before we joined the browser wars, when our focus was on scientific data visualization tools). We had a product named Spyglass Format which had a bug involving our failure to properly dispose of a handle we got from the Mac palette manager. The visible symptom of that bug was an intermittent, unreproduceable crash. Bugs like that are so hard to find, but this one was unusually difficult, because the crash always happened in a different app, not in Spyglass Format. :-)

Of course, the affected user started by calling the vendor of the other product (which happened to be Apple) about this problem. And of course, Apple was unable to help them. And of course, when they called us to claim that "it seems like MPW only crashes when Spyglass Format is also running", we were initially rather skeptical. The whole thing took months to figure out.

Sync and Constraints

Let's talk about situations where you are using SQLite on a mobile device and synchronizing with SQL Server on the backend.

Compared to an app which does all database operations over REST calls, the advantages of this "replicate and sync" architecture include offline support and much better performance. However, one of the potential disadvantages of this approach is that it can move the symptom of a constraint violation bug far away from the incorrect code that caused it.

In your SQL Server database on the backend, you have constraints which are designed to protect the integrity of your data.

Suppose you have an app which is trying to INSERT an invalid row directly, such as through ADO.NET. The constraint violation will cause an error right away. This is good.

However, in a mobile app which uses "replicate and sync", changes to the data happen in two steps:

  1. The row gets INSERTed into a SQLite database on the mobile device.

  2. Later, the next time that device syncs with the backend, that row will get INSERTed into the actual SQL Server database.

If the new row is invalid (because of, say, a bug in the mobile app), we want the failure to happen when we try the INSERT into SQLite on the mobile device, not [potentially much] later when the sync happens.

Or to put this another way: Any transaction successfully committed to the SQLite database on the mobile device should also succeed when that change is synchronized to the SQL Server backend.

Summary

If SQLite always behaved exactly like SQL Server, this would not be an issue. But there are differences, and that's what this blog series is all about. Several of the entries later in this series will deal with specific cases where SQLite might accept something that SQL Server would not. In a "replicate and sync" architecture, all of these cases deserve a bit of extra attention.

 

 

EF6 on Xamarin: Progress (or lack thereof)

(This entry is part of a series. The audience: SQL Server developers. The topic: SQLite on mobile devices.)


Yes, actually I DO want that

"I want Entity Framework on Xamarin."

Variations of this remark show up regularly in various online forums. The typical response is something like: "You don't want Entity Framework on a mobile device. It's too big and heavy. You should just use sqlite-net."

I hate this reply, but it has [more than] a grain of truth in it. Yes, sqlite-net is very cool, and a much better fit for mobile use cases. Yes, Entity Framework is big and heavy.

But I would bet a dollar that at least one person has claimed that "EF is too big for mobile" and then proceeded to play "Star Wars: Knights of the Old Republic" on their iPhone.

Besides, some people have a a big pile of EF code and need to port it for a mobile app. There are valid reasons to want EF on Xamarin.

Recently I decided to get my hands dirty and try to make this work. I have not succeeded. Yet. But I've made some progress and reached a good place to report my findings. Maybe other people working on this will find some useful information here.

The Goal

  • I'd like to see EF6 code running on Xamarin.iOS and Xamarin.Android.

  • Tooling doesn't matter much. I'm not expecting Xamarin Studio to show me lines and boxes for my model.

  • I would even declare some measure of victory if I had a solution which handled only one of EF's four workflows. I focused on "Code First with Existing Database".

  • The only provider that matters is SQLite. I focused on System.Data.SQLite.

The Challenge

There are basically four different implementations of the platform:

  • .NET is, well, the full platform. The "official" one. From Microsoft.

  • Mono is an open source implementation. It has most everything in .NET. But not all.

  • Xamarin.Android is Mono for Android. It has a lot of Mono, but not all. (Most notably, System.Configuration is missing.)

  • Xamarin.iOS is Mono for iOS. It has even less of the stuff in Mono than the Android version does. (Most notable, it's an AOT compiler.)

I focused on Xamarin.Android.

Two clarifications before I proceed

  • I'm a huge fan of Xamarin. Anything here which looks like criticism is just me being objective about the weaknesses and tradeoffs of their still-very-awesome product suite.

  • Julie Lerman's dog probably knows more about Entity Framework than I do.

Starting point

Entity Framework was open sourced in 2012, and shortly thereafter was made part of Mono 2.11.3.

For a little while, I proceeded under the delusion that I would not have to make any changes to the Entity Framework code. That didn't last long.

When I started hacking and tearing things apart, I began here.

I had no problem building the tree on Windows. With a few small changes to the build files (remove all use of $([MSBuild]::GetDirectoryNameOfFileAbove())), I was able to build on my Mac using:

xbuild EntityFramework.csproj

System.Configuration

Perhaps the biggest issue is the lack of System.Configuration in both of the Xamarin platforms.

Xamarin dude Jonathan Pryor says:

"The problem with System.Configuration is that once it's in the door, the entire XML stack comes along for the ride, and the linker can't remove it because it's used from ~everywhere. System.Xml.dll is 1.2MB, so that would be at least a 1.2MB increase to minimum app sizes."

This is a huge problem for Entity Framework, which uses the app.config file all over the place. I tried several different approaches to deal with this:

  • First I tried to just remove all the code which uses System.Configuration, starting with a hacked up version of AppConfig.cs. Then I realized what a huge change that was, so I reverted and started over.

  • Then I tried to just stub in a minimal, non-functional implementation of only the parts of System.Configuration that I needed to make the build succeed. Then I realized what a huge job that was, so I reverted and started over.

  • Then I tried to bring in Mono's System.Configuration.dll and make it part of my app. This would be the preferable solution anyway, since it would require far fewer code changes to the Entity Framework code while preserving some ability to use an app.config file. But I couldn't get this to work, and somebody at Xamarin support expressed significant pessimism about this path.

  • Finally, I just removed all references to System.Configuration, including AppConfig.cs and everything that references it. This solution is far from ideal, but it did get me moving on to the next problem.

System.Data.Common.DbProviderFactories

Xamarin doesn't support this either, largely for the same reason. It's really just another way of reading an XML config file.

This function is used in a few places in the Dependency Resolution part of Entity Framework.

Right about here is when I got seriously tempted to just remove all the Dependency Resolution code completely. After all, I only care about one ADO.NET provider. Why do I need this big and ultra-powerful config system which is mostly designed to support a diversity of providers that I don't care about?

I talked myself off this ledge before too long. I remove the uses of System.Data.Common.DbProviderFactories, replacing them temporarily with stuff like "return null".

Subclass DbConfiguration

My rationale for removing all these configuration capabilities was that I hoped they would not be necessary if I gave my DbContext subclass an actual connection instead of a connection string.

    public class BloggingContext : DbContext
    {
        public BloggingContext(DbConnection c) : base(c, false)
        {
        }

        public DbSet Blogs { get; set; }
        public DbSet Posts { get; set; }
    }

...

        var sb = new System.Data.SQLite.SQLiteConnectionStringBuilder();
        sb.DataSource = Path.Combine(Path.GetTempPath(), "whatever.db");
        string cs = sb.ConnectionString;

        var conn = new System.Data.SQLite.SQLiteConnection(cs);
        conn.Open();

        using (var db = new BloggingContext(conn))
        {
            var blog = new Blog { Name = "thoughts" };
            db.Blogs.Add(blog);
            db.SaveChanges();

            // Display all Blogs from the database 
            var query = from b in db.Blogs
                        orderby b.Name
                        select b;

            Console.WriteLine("All blogs in the database:");
            foreach (var item in query)
            {
                Console.WriteLine(item.Name);
            }
        }

At some point I realized that Entity Framework needs a lot more info about the provider than just the connection string. Julie's dog would have known better.

So I read about Code-Based Configuration. And then I did this:

    public class MyConfiguration : DbConfiguration
    {
        public MyConfiguration()
        {
            SetProviderServices(
                "System.Data.SQLite", 
                System.Data.SQLite.EF6.SQLiteProviderServices.Instance
                );
        }
    }

And that required me to hack the SQLite provider to make System.Data.SQLite.EF6.SQLiteProviderServices public instead of internal, which seems more correct anyway. I think.

AssociatedMetadataTypeTypeDescriptionProvider

For some reason, this class (in System.ComponentModel.DataAnnotations) is not supported by Xamarin. It's intentional. The code on Github shows that file surrounded by:

    #if !MOBILE

But I don't know why.

I ducked this problem by copying a bunch of stuff from that Mono file into my hacked-up code.

Time for a break

This is where I decided to pause. The next problem was an exception being thrown while Entity Framework was trying to ingest SQLiteProviderServices.ProviderManifest.xml. I stared at this for a while, but didn't get past it.

It's Friday. I spent most of the week on this investigation, and now I need to catch up on some other things.

If/when I make more progress on this, I'll post more info about it.

If you make progress, please let me know.

I am hopeful about this link:

Entity Framework Everywhere

Summary

You don't want Entity Framework on a mobile device. It's too big and heavy. You should just use sqlite-net.

 

 

Feel

(This entry is part of a series. The audience: SQL Server developers. The topic: SQLite on mobile devices.)


Culture Clash

Ever used iTunes for Windows?

Or Microsoft Office on a Mac?

You can usually tell when a piece of software was ported from another platform. Everything is a little slower. Keyboard shortcuts don't follow the conventions. The menu item is called Preferences instead of Options.

Maybe the software works. Maybe there's nothing really wrong with it. But things just don't feel quite right.

And even if the app feels okay to you, your dog knows that something is a bit off.

Sometimes the situation improves 2 or 3 releases later. But not always. And even if it does, the truth is that a cross-platform app may never feel like one which was written by a true disciple of the religion. Evernote will never feel as Deeply iOS-ish as Vesper.

But this is a conscious tradeoff. Vesper will never be as ubiquitous as Evernote. That's important too.

The wind has shifted

If you have self-identified as a "Microsoft-centric developer" for more than, say, four years, then you are probably experiencing a lot of uncomfortable change.

Not that long ago, all your tooling and platforms just felt right. The line between Microsoft Stuff and Other Stuff was pretty clear. As long as you stayed with solutions provided by Microsoft, you generally knew what to expect.

Somewhere along the way, all that got fouled up.

  • Microsoft contributed a ton of code to make node.js run natively on Windows. Without forking it.

  • Azure supports Linux, Java, PHP and Ruby.

  • Git ?!? Really?

Microsoft is embracing outside stuff like never before. The world is different. (Like everything else coming out of Dev Div, if you don't like it, blame Steve Ballmer, and if you do like it, credit ScottGu.)

Some of these technologies got a nice makeover from the folks in Redmond. Like node.js.

Others just suddenly appeared in the Microsoft ecosystem without much prep. Showing up at a nice club in jeans and sneakers. Sitting around looking awkward.

When considered as part of the Microsoft ecosystem, as a possible replacement for SQL CE, this awkwardness is the current status of SQLite. It just doesn't feel right.

Makeover pending?

People have been asking me to cite an official source for my claim that "Microsoft apparently considers SQLite to be the replacement for SQL Server Compact". But I don't really have one.

On the status of SQL CE itself, there is this comment on a Connect item, written by somebody named "Microsoft":

"SQL Server compact edition is in deprecation mode with no new releases planned near future".

But that remark doesn't mention SQLite.

If you Google (or Bing) around a bit, you'll find more "evidence" that would be described as informal or anecdotal. Why isn't Microsoft making a clear statement on this?

I assume it's because SQLite hasn't had its makeover. Maybe one is coming. Heck if I know. I've got no inside info. I just think it's likely that somebody inside Microsoft has a work item entitled "Make the situation better for people using SQLite".

In the meantime, using SQLite feels like you're using something that didn't originally come from Microsoft. Because it didn't. SQLite's core API is in C, and all the function names are lower_case_with_underscores(). Jeans and sneakers.

Why use SQLite if it's so awkward?

Because SQLite is the best available option for SQL on mobile. That's why it is installed on over a billion devices.

And that's why it's gaining traction in the Microsoft community, despite its non-Microsoft-ish feel.

Microsoft isn't an island anymore. Get used to it.

And it's not as bad as all that

Actually, lots of people have done cool stuff to help SQLite shed its awkwardness. Packages with nice installers. ADO.NET support. C# wrappers with MixedCaseIdentifiers. I'll be delving into some of these things later.

 

 

Architecture

(This entry is part of a series. The audience: SQL Server developers. The topic: SQLite on mobile devices.)


No server

A big distinction between SQL Server and SQLite is visible in the fact that only one of them has the word 'server' in its name.

Much like SQL Server Compact (SQL CE), SQLite is designed to run in the same process as your app. It's an embedded database, not a server.

One file

Another way that SQLite is similar to SQL CE is that both of them store a database in a single file. But unlike SQL CE, the size of a SQLite database file is not limited to 4GB. SQLite databases of 50GB are fairly common (on desktops or servers, not mobile devices), and I've heard of people using databases much larger than that.

Open a "connection"

Even though there is no network between your app and your database file, SQLite still uses the terminology of a "connection" to describe the context in which your app interacts with a database. To begin using a database file, you "open" a connection. But you don't need an ODBC connection string or an ADO.NET connection string. Rather, what you need is a filename.

SQLite also supports the ability to have one connection be attached to more than one database, using the ATTACH command. This can be really handy when you want to break your data up into multiple files, while retaining the ability to do SELECTs with JOINs across those files.

Concurrency and threads

SQLite does support concurrent access to the database file. Multiple processes can access the same database file, or multiple threads within the same process. Like any other database, SQLite uses locking to keep everything straight. It supports a locking model where readers do not block writers and writers do not block readers.

This level of concurrency support is very important when developing a mobile app, where environments like iOS have strict rules about keeping the app responsive to the user by performing all blocking or intensive operations in a background thread.

No user logins

As a consequence of its no-server/embedded architecture, SQLite has no support for authentication or user logins. If an app has access to the database file, then it has access to everything inside it.

Summary

SQLite is an embedded SQL implementation which stores a database in a single file. It is a lot like what SQL Server Compact might be if it were cross-platform and open source.

 

 

Introduction

(This entry is part of a series. The audience: SQL Server developers. The topic: SQLite on mobile devices.)


A Wave of Change

More and more of the world's "computers" are actually mobile devices. Perhaps you've noticed. :-)

As of February 2014, this trend is huge and still accelerating. Benedict Evans recently pointed out that Apple sold more iOS devices in Q4 2013 than the number of Windows PCs sold globally: "mobile is the next computing platform and it's a lot bigger than PCs in unit sales, so even the smaller player can overtake the total PC business".

Remember back around 2005 when we started writing all our clients to run inside a web browser? It was a great disturbance in the industry, as if millions of desktop GUI toolkits suddenly cried out in terror and were suddenly silenced.

This mobile wave feels even bigger than that.

And it's happening, whether we like it or not.

Suppose you are an enterprise CIO. Maybe you don't want everybody switching to mobile. Maybe you don't like this Bring Your Own Device (BYOD) thing, which forces you to deal with security and compatibility and deployment issues for dozens of different mobile devices chosen at the whims of your fickle coworkers.

But that's just how it is.

It is hard to overstate just how big this wave of change is. Some of us will be unaffected, but not many.

However...

One thing that is NOT going away is SQL.

What?!? I heard SQL was dying.

I have nothing against the NoSQL world and I have no interest in this religious war. Couchbase. Cassandra. Hadoop. There's a lot of cool and interesting stuff going on.

But all that stuff exists on a planet where the following things are also true:

  • The vast majority of data in the business world is in some kind of SQL database.

  • Worldwide investment in SQL-related technologies is in the hundreds of billions of dollars. And I may be understating that.

  • The business world makes sweeping technology changes much more slowly than the consumer world. Your bank still runs on COBOL. IBM mostly sells things that people think are dead, and they still have more revenue than Google.

  • Switching a large system from SQL to anything NoSQL-ish is a non-smooth transition. The paradigm is very different.

Most new software projects are not greenfield. Most new code has a requirement to integrate with something else that already exists.

Your IT department is simply NOT going to deal with the mobile wave by rewriting everything on Mongo. They're not going to give up constraints and ACID and foreign keys. They're not going to rewrite all their queries with map/reduce. They're not going to discard a mountain of established code and tooling and expertise. It's just not gonna happen.

NoSQL has its place. But SQL will not disappear during your lifetime.

But there's still that wave...

Your business data is in SQL. Your users are trading in their PCs for smartphones and tablets. If it hasn't happened already, you are going to be asked to let them view/edit that data on their mobile devices.

If you want to create a mobile app, you've got two main options for how that app can interact with data on the backend:

  1. Remote procedure calls: Keep a live network connection to the server and do all your database CRUD operations over something like a REST API.

    (See also: "web services")

  2. Replicate and sync: Keep a [partial] copy of the data from the server and do all your database CRUD operations locally on the mobile device, synchronizing changes in both directions when the network is available.

    (See also: "occasionally connected", "offline first")

Or some combination of the two.

There are pros and cons to both of these two approaches. Each of them is popular for different reasons. But my primary area of interest is in the situations where you want to do at least some of your database operations on the device without a network involved.

And if you want to deal with [some of] your data on the device, you probably want to use SQL on that device, right? Especially if you want to port some existing code.

SQL on mobile devices

The bad news: SQL Server Compact won't help you anymore. It is being left behind as Microsoft moves ahead with a mobile strategy built on WinRT. And the world is cross-platform now anyway. Even if Microsoft decides to renew its commitment to a compact edition of SQL Server for its own mobile OS, it is not likely they will support the same solution on iOS or Android.

The good news: SQLite (sqlite.org) is an outstanding small implementation of SQL. It is open source and supported on all mobile platforms. In fact, over a billion mobile devices currently have SQLite preinstalled. It is ubiquitous, reliable, and far more capable than the word "Lite" would seem to imply. Microsoft apparently considers SQLite to be the replacement for SQL CE. If, for whatever reason, you want to use SQL on a mobile device, then SQLite is [almost certainly] your best choice.

About this series of articles

I'm writing about using SQLite when developing mobile apps. And my target audience is folks who are coming from the world of SQL Server and .NET.

There are differences. There is a learning curve. Hopefully this series of articles will help.

Most of the entries will be fairly short and focused on a specific issue.

This is [mostly] not a product pitch

My goal in this series is to make it useful for people whether they use my product or not.

But it's only fair to disclose this right up front: My company makes a product called Zumero for SQL Server, which is designed to synchronize data between SQL Server on the backend and SQLite on a mobile device.

But these articles are not Zumero-specific. Maybe you're doing sync with Microsoft Sync Framework. Or maybe your sync needs are simple and you're just writing it yourself. Or maybe you're not actually doing sync at all.

Whatever. If you are coming from a SQLServer/.NET background and want to know about using SQLite on mobile, I hope you find value in what I'm writing.

(Zumero is relevant to the topic at hand, so I'm sure I'll end up mentioning it. When I do, I'll try to be obvious about it so you can skip those parts if you want.)

 

 

ZSS 1.1

Version 1.1 of Zumero for SQL Server (ZSS) is now available.

This release includes lots of bug fixes and polishing improvements. Version 1.1 also adds compatibility with Windows Azure SQL Database.

But the two main improvements of the 1.1 release are (1) server-side filters and (2) things to make it easier to get started.

Filters

The major new feature in 1.1 is server-side filtering, which allows you to sync partial tables down to the mobile device.

A filter can exclude a table or some of its columns. Or a filter can use a WHERE clause to exclude rows. Or both.

Filters can be mapped to users, so that each user can have their mobile device receive only the information they need.

Example use cases:

  • Exclude the [Salary] column from syncs by people who are not allowed to see it.

  • Exclude a large blob/image column to save bandwidth and storage space on the mobile device.

  • Configure row filters such that field sales people only sync down their own leads.

  • Use row filtering to ensure that only the not-yet-completed tasks are sync-ed to mobile devices.

Unsurprisingly, this has been our top feature request. It is a critical feature for a replicate-and-sync architecture. Filtering allows Zumero to be used with large SQL Server databases without requiring each mobile device to have a copy of unnecessary data.

Easier to get started

A big focus for the 1.1 release has been to make it easier for people to get started. We want to remove obstacles and make it painless for people to try Zumero.

First of all, we've got a spiffy new Getting Started guide that holds your hand for the initial config. This guide provides help with the most common technical support questions we get from new users.

On the server side, we've made improvements to the installer to help get everything configured properly. Perhaps the most important change here is a simple wizard to build your ODBC connection string for you.

Power users can still enter any ODBC string they want. The builder feature is simply there for convenience.

On the client side, we've added a Test App (so you can see your data on a mobile device as quickly as possible), and an App Generator (which will provide a buildable skeleton app with a UI customized for your data).

Test App

The ZSS Test App lets you verify your setup all the way to a real mobile device.

It's a simple client app with basic support for viewing data and making simple changes.

Currently, the ZSS Test App is Android-only. An iOS version is on the way.

App Generator

The ZSS App Generator (ZAG) is a desktop app which generates source code and build scripts for a simple, working app that syncs and views data from your SQL Server database.

Many of our customers are not mobile developers (yet!). They want ZSS because they have a bunch of data in SQL Server and they need to build an app to allow mobile users to interact with it. This plunges them into an unfamiliar world of Objective-C and Java and SQLite.

ZAG is a way to get some early traction. The app generated by ZAG can be used as a skeleton for your project. Or simply as customized sample code for copy-and-paste.

Currently, ZAG generates apps only for Android/Java. Support for other targets is on the way. iOS/ObjC will be next. Since ZAG is very new and there is so much cool stuff to be done, it will be moving forward on its own release schedule, not necessarily in lockstep with the ZSS server and client SDKs.

Customer Stories

Some of our early ZSS customers are wrapping up their first projects, which means we can begin sharing stories of their success with Zumero. Here's one from Pilot Flying J, the largest operator of travel centers and travel plazas in the United States.

Lately I find myself preferring to buy gas at the Pilot station not far from my home, but so far I have resisted the temptation to ask the manager to pull out their iPhone and show me their Zumero-based apps in action. :-)

Lots more cool Zumero stuff is on the way!