How Google App Engine Datastore Works

Kyle Roche & Jeff Douglas

(The following is an excerpt from "Beginning Java Google App Engine," published by Apress.)

Designing highly scalable, data-intensive applications can be tricky. If you've ever used hardware or software load balancing, you know that your users can be interacting with any one of a dozen or so web and database servers. A user's request may not be serviced from the same server that handled his previous request. These servers could be spread out in different data centers or perhaps in different countries, requiring you to implement processes to keep your data safe, secure, and synchronized. The hardware and software required to scale your application can also be complex and expensive, and may even dictate that you outsource or hire dedicated resources.

With App Engine, Google takes care of everything for you. The App Engine datastore provides distribution, replication, and load-balancing services behind the scenes, freeing you up to focus on implementing your business logic. App Engine's datastore is powered mainly by two Google services: Bigtable and Google File System (GFS).

Bigtable is a highly distributed and scalable service for storing and managing structured data. It was designed to scale to an extremely large size with petabytes of data across thousands of clustered commodity servers. It is the same service that Google uses for over 60 of its own projects including web indexing, Google Finance, and Google Earth.

The datastore also uses GFS to store data and log files. GFS is a scalable, fault-tolerant file system designed for large, distributed, data-intensive applications such as Gmail and YouTube. Originally developed to store crawling data and search indexes, GFS is now widely used to store user-generated content for numerous Google products.

Bigtable stores data as entities with properties organized by application-defined kinds such as customers, sales orders, or products. Entities of the same kind are not required to have the same properties or the same value types for the same properties. Bigtable queries entities of the same kind and can use filters and sort orders on both keys and property values. It also pre-indexes all queries, which results in impressive performance even with very large data sets. The service also supports transactional updates on single or application-defined groups of entities.

The first thing you'll notice about Bigtable is that it is not a relational database. Bigtable utilizes a non-relational object model to store entities, allowing you to create simple, fast, and scalable applications. Google isn't alone in offering this type of architecture. Amazon's SimpleDB and many open-source datastores (for example, CouchDB and Hypertable) use this same approach, which requires no schema while providing auto-indexing of data and simple APIs for storage and access.

You can interact with Bigtable using either a standard API or a low-level API. With the standard API, either a Java Data Objects (JDO) or Java Persistence API (JPA) implementation, you can ensure that your applications are portable to other hosting providers and database technologies if you decide to jump ship. This makes a good argument for App Engine as it helps you avoid vendor lock-in. If you are certain that your application will always run on App Engine, you can utilize the low-level API as it exposes the full capabilities of Bigtable. Both APIs achieve roughly the same results in terms of ability and performance, so it comes down to personal preference. Do you like working with low-level database functionality or abstracting this layer so that your experience is applicable across multiple datastore implementations?


The datastore provides full CRUD (create, read, update, and delete) access to entities in Bigtable and allows you to query against the datastore using a standard SQL-like query language called JDOQL. The syntax is enough like SQL to lull you into a sense of familiarity, but there are some differences when dealing with JDO-enhanced objects. One notable difference is the lack of support for joins, which are present in relational databases. However, this is understandable since the datastore is non-relational.

Working with Entities

The fundamental unit of data in the datastore is an 'entity,' which consists of an immutable identifier and zero or more properties. Once again, entities are schemaless and this allows for some interesting possibilities. Since entities are not required to have the same properties or types, your application must enforce adherence to your data model, whatever that may be at the time. A property can have one or more values, embedded classes, child objects, and even values of mixed types. Entities are very flexible and are not defined by a database schema as in a relational database. At any point during the application life cycle you can add or remove entity properties. Newly created and fetched entities will utilize this new schema. Your application's logic must be able to handle these changes.
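To make the schemaless idea concrete, here is a minimal plain-Java sketch. It is not the App Engine API: the class names SketchEntity and SchemalessDemo, the "Customer" kind, and the "phone" property are all hypothetical and exist only for illustration. The sketch shows entities of the same kind carrying different properties and value types, with the application code coping with the differences.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of a schemaless entity. NOT the App Engine API;
// all names here are assumptions made for illustration.
final class SketchEntity {
    private final String kind; // application-defined kind, e.g. "Customer"
    private final long id;     // immutable identifier
    private final Map<String, Object> properties = new LinkedHashMap<String, Object>();

    SketchEntity(String kind, long id) {
        this.kind = kind;
        this.id = id;
    }

    String getKind() { return kind; }
    long getId() { return id; }

    // Properties can be added or removed at any time; no schema is consulted.
    void setProperty(String name, Object value) { properties.put(name, value); }
    void removeProperty(String name) { properties.remove(name); }
    Object getProperty(String name) { return properties.get(name); }
}

final class SchemalessDemo {
    // The application, not the datastore, handles a property that is
    // missing or stored with a different type.
    static String describePhone(SketchEntity customer) {
        Object phone = customer.getProperty("phone");
        if (phone == null) {
            return "no phone on record";
        }
        return String.valueOf(phone);
    }

    public static void main(String[] args) {
        SketchEntity a = new SketchEntity("Customer", 1L);
        a.setProperty("phone", "555-0100"); // stored as a String

        SketchEntity b = new SketchEntity("Customer", 2L);
        b.setProperty("phone", 5550199L);   // same property, different type

        SketchEntity c = new SketchEntity("Customer", 3L); // no phone at all

        System.out.println(describePhone(a));
        System.out.println(describePhone(b));
        System.out.println(describePhone(c));
    }
}
```

Nothing in the model rejects the third customer for lacking a phone property. Any such rule would have to live in application code, which is the trade-off the paragraph above describes.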

App Engine uses the Java Persistence API (JPA) and Java Data Objects (JDO) interfaces for modeling and persisting entities. These APIs, rather than the low-level API, ensure application portability. For your application, you'll use JDO since the Eclipse plug-in generates your JDO configuration files. Of course, JPA is supported, but it requires some additional setup and configuration steps. If you are familiar with Hibernate or other object-relational mapping (ORM) solutions, JDO should be fairly easy to grok as these solutions share many features.

App Engine's JDO implementation is provided by the DataNucleus Access Platform, an open-source implementation of JDO 2.3. Again, the JDO specification is database-agnostic and defines high-level interfaces for annotating simple POJOs, persisting and querying objects, and utilizing transactions. Applications implementing JDO can query for entities by property values or they can fetch a specific entity from the datastore using its key. Queries can return zero or more entities and sort them by property values, if desired.

Classes and Fields

JDO uses annotations on POJOs to describe how these objects are persisted to the datastore and how to recreate them when they are, in turn, fetched from the datastore. The kind of entity is defined by the simple name of the class while each class member specified as persistent represents a property of the entity. The data class is required to have a field dedicated to storing the primary key of its corresponding entity.

Each entity has a key that is unique to Bigtable. Keys consist of the application ID, the entity ID, and the kind of entity. Some keys may also contain information pertaining to the entity group. Your application can generate keys for your entities, or you can allow Bigtable to automatically assign numeric IDs for you. In most cases it is easier to let Bigtable assign your keys so you don't have to write code to ensure that your keys are unique across all objects of the same kind plus entity group parent (if being used).

There are four types of primary key fields:

1. Long: An ID that is automatically generated by Bigtable when the instance is saved.

2. Unencoded String: An ID or "key name" that your application assigns to the instance before it is saved.

3. Key: A value that includes the key of any entity-group parent that is being used and an application-generated string ID or a system-generated numeric ID.

4. Key as Encoded String: Essentially, an encoded key to ensure portability and still allow your application to take advantage of Bigtable's entity groups.

If you want to implement your own key system, you simply use the createKey static method of the KeyFactory class. You pass the method the kind and either an application-assigned string or a system-assigned number, and the method returns the appropriate Key instance.
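The shape of such a key can be sketched in plain Java. The SketchKey class below is a hypothetical stand-in, not the App Engine Key or KeyFactory classes. It only mirrors the idea described above: a key is built from a kind plus either a string name or a numeric ID, optionally under an entity-group parent. The application ID is left out for brevity, and the overload that takes a parent is an assumption added for the illustration.

```java
import java.util.Objects;

// Plain-Java model of a datastore key. NOT the App Engine Key class;
// every name in this sketch is an assumption made for illustration.
final class SketchKey {
    private final SketchKey parent; // entity-group parent, or null for a root entity
    private final String kind;      // kind of entity, e.g. "Customer"
    private final String name;      // application-assigned key name, or null
    private final long id;          // numeric ID, used only when name is null

    private SketchKey(SketchKey parent, String kind, String name, long id) {
        this.parent = parent;
        this.kind = Objects.requireNonNull(kind, "kind");
        this.name = name;
        this.id = id;
    }

    // Key from a kind and an application-assigned string.
    static SketchKey createKey(String kind, String name) {
        return new SketchKey(null, kind, Objects.requireNonNull(name, "name"), 0L);
    }

    // Key from a kind and a numeric ID.
    static SketchKey createKey(String kind, long id) {
        return new SketchKey(null, kind, null, id);
    }

    // Key for a child entity inside the parent's entity group.
    static SketchKey createKey(SketchKey parent, String kind, long id) {
        return new SketchKey(Objects.requireNonNull(parent, "parent"), kind, null, id);
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof SketchKey)) {
            return false;
        }
        SketchKey k = (SketchKey) other;
        return Objects.equals(parent, k.parent) && kind.equals(k.kind)
                && Objects.equals(name, k.name) && id == k.id;
    }

    @Override
    public int hashCode() {
        return Objects.hash(parent, kind, name, id);
    }

    // Renders the full path from the root ancestor down to this key.
    @Override
    public String toString() {
        String self = kind + "(" + (name != null ? "\"" + name + "\"" : String.valueOf(id)) + ")";
        return parent == null ? self : parent + "/" + self;
    }
}

final class KeyDemo {
    public static void main(String[] args) {
        SketchKey customer = SketchKey.createKey("Customer", "acme");
        SketchKey order = SketchKey.createKey(customer, "Order", 42L);
        System.out.println(customer); // Customer("acme")
        System.out.println(order);    // Customer("acme")/Order(42)
    }
}
```

In this model, an Order with ID 42 under a parent is a different key from a root Order with ID 42, which matches the earlier point that uniqueness is scoped to the kind plus the entity-group parent.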


