
November 26, 2008

Simple Remote Procedure Call - Array Response

SRPC-ArrayResponse is an extension to SRPC. It carries an array of key/value lists as the response.

Multiple key/value lists could also be encoded as ordinary SRPC values, with appropriate escaping and encoding for each list. But SRPC-ArrayResponse offers a standardized way to represent an array of key/value lists in place of the usual one-dimensional list.

The normal SRPC response looks like:

  1. a=b
  2. c=d

The ArrayResponse allows for multiple values of similar keys:

  1. 0:a=b
  2. 0:c=d
  3. 1:a=b1
  4. 1:c=d1
  5. 2:a=b2
  6. 2:c=d2

A more realistic example that returns multiple binary items:

  1. Status=1
  2. 0:Filename=sp.gif
  3. 0:Data/Encoding=base64
  4. 0:Data/Type=image/gif
  5. 0:Data=R0lGODlhAQABAIAAAP///////yH5BAEAAAEALAAAAAABAAEAAAICTAEAOw==
  6. 1:Filename=sp2.gif
  7. 1:Data/Encoding=base64
  8. 1:Data/Type=image/gif
  9. 1:Data=R0lGODlhDwAOAIAAAP///////yH5BAEKAAEALAAAAAAPAA4AAAIMjI+py+0Po5y02osLADs=
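An ArrayResponse like the one above decodes mechanically into a list of key/value dicts. A minimal decoder sketch (Python for illustration; the function name is mine, not part of the spec):

```python
def parse_array_response(body):
    """Split an SRPC ArrayResponse into global keys and an array of key/value dicts."""
    globals_, items = {}, {}
    for line in body.splitlines():
        if not line:
            continue
        key, _, value = line.partition("=")
        prefix, colon, rest = key.partition(":")
        if colon and prefix.isdigit():
            # "N:key=value" belongs to item N
            items.setdefault(int(prefix), {})[rest] = value
        else:
            # unprefixed keys like "Status" are global
            globals_[key] = value
    return globals_, [items[i] for i in sorted(items)]
```

Parsing the first example above would yield two item dicts with identical keys and different values.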

The extension looks very similar to SRPC-Batch. The difference is that SRPC-Batch transfers multiple requests and multiple responses in a single transaction whereas SRPC-ArrayResponse has only one request and an item list in the response.

Rationale: the decoder can be shared for both cases. It is always clear how to interpret the array, because you know whether you expect an array response or multiple responses.

October 23, 2009

Database as a Backend Web Service

The database is always the bottleneck. This is what all the admins of massive services say in talks about their scaling efforts.

In short:

  • Database used to mean SQL
  • It is difficult to scale SQL CPU
  • It is simple to scale Web-Frontend CPU
  • The SQL philosophy puts the burden on reads by enabling very complex SELECTs and JOINs, while writes are usually simple, short INSERTs. That is just the wrong concept in a massive world. We need quick and simple read operations, not complex reporting features.

Therefore many people step back from SQL and use other databases. Read more about the NoSQL movement. You have the choice: CouchDB, MongoDB, Tokyo Tyrant, Voldemort, Cassandra, Ringo, Scalaris, Kai, Dynomite, MemcacheDB, ThruDB, HBase, Hypertable, AWS SimpleDB, or just use Amazon S3 as a stupid document store. SQL can also be 'misused' as a quick document/key-value oriented store. It still has some key benefits.

Basically, all you need is a key-value collection store with some indexing, aka a document store. Whatever you decide, you are bound to it, and that sucks. So why not decouple the application logic from the database? Decoupling can be done in different ways. Traditionally you had a thin database code layer that tried to abstract over different (SQL) databases. Now I need more abstraction, because there might well be a non-SQL database in the mix.

I decided to put a web service style frontend-backend separation between application code and database. This makes the DB a web service. In other words: There is HTTP between application and DB which allows for massive scaling. Eventually, my DBs can be scaled using web based load balancing tools. This is great. I can also swap out the DB on a per table basis for another database technology. Also great, because I do not have to decide about the database technology now and this is what this article really is about, right?

So, now I design the DB web service interface. I know what I need from the database interface. These are the requirements:
  1. Database items (think: rows) are Key-Value collections
  2. Sparse population: not all possible keys (think: column names) exist for all items
  3. One quick primary key to access the collection or a subset of key-values per item
  4. Results contain at most one item per request. I will emulate complex searches and multi-item results in the application (disputed by Ingo, see Update 1)
  5. Required operations: SET, GET, DELETE on single items
  6. Support auto-generated primary keys
  7. Only data access operations, no DB management.

This is the interface as code:

  1. interface IStorageDriver
  2. {
  3. // Arguments:
  4. // sType: Item type (think: table).
  5. // properties: The data. Everything is a string.
  6. // names: Column names.
  7. // condition: A simple query based on property matching inside the table. No joins. Think: tags, or WHERE a=b AND c=d.

  8. // Add an item and return the auto-created ID
  9. string Add(string sType, Dictionary<string, string> properties);
  10. // returns: the created ID

  11. // Set item properties, may create an item with the specified ID
  12. void Set(string sType, string sId, Dictionary<string, string> properties);

  13. // Fetch item properties by ID or condition, may return only selected properties
  14. Dictionary<string, string> Get(string sType, string sId, List<string> names);
  15. List<Dictionary<string, string>> Get(string sType, Dictionary<string, string> condition, List<string> names);
  16. // returns: the data. Everything is a string.

  17. // Delete an item by ID
  18. bool Delete(string sType, string sId);
  19. // returns: True = deleted, False = item did not exist; the resulting state is the same
  20. }

I added the "Add" method to support auto-generated primary keys. Basically, "Set" would be enough, but there are databases or DB schemas which generate IDs on insert, remember?

All this is wrapped up in an SRPC interface. It could be SOAP, but I do not want the XML parsing hassle (not so much the overhead). WSDLs suck. Strong typing of web services is good, but it can be replaced by integration tests under adult supervision.

On the network this looks like:

Request:
  1. POST /srpc HTTP/1.1
  2. Content-length: 106

  3. Method=Data.Add
  4. _Type=TestTable
  5. User=Planta
  6. Age=3
  7. Identity=http://ydentiti.org/test/Planta/identity.xml

Response:
  1. HTTP/1.1 200 OK
  2. Content-length: 19

  3. Status=1
  4. _Id=57646

Everything is a string. This is the dark side for SQL people. The application knows each type and asserts type safety with integration tests. On the network all bytes are created equal; they are strings anyway. The real storage drivers on the data web service side will convert to the database types. The application builds cached objects from data sets and maps data to internal types. There are no database types in the application's data model. Business objects are aggregates, not table mappings (LINQ is incredibly great, but not for data on a massive scale).

BUT: I could easily (and backward compatibly) add type safety by adding type codes to the protocol, e.g. a subset of XQuery types, or like here:

  1. User=Planta
  2. User/Type=text
  3. Age=3
  4. Age/Type=int32
  5. Identity=http://ydentiti.org/test/Planta/identity.xml
  6. Identity/Type=url
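Such /Type meta-keys could be applied generically on the receiving side. A sketch of a decoder that honors them (Python for illustration; the type-to-converter mapping is my assumption, not part of the protocol):

```python
# Hypothetical converters for the type codes used in the example above.
CONVERTERS = {
    "text": str,
    "int32": int,
    "url": str,  # could be a dedicated URL type in a richer runtime
}

def apply_types(params):
    """Convert plain string values according to 'key/Type' meta parameters.
    Keys without a /Type meta parameter stay strings (type code 'text')."""
    typed = {}
    for key, value in params.items():
        if key.endswith("/Type"):
            continue  # meta parameter, not data
        type_code = params.get(key + "/Type", "text")
        typed[key] = CONVERTERS.get(type_code, str)(value)
    return typed
```

Because unknown meta-keys are simply ignored by older receivers, this addition stays backward compatible, as claimed above.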

The additional HTTP is overhead. But SQL connection setup is bigger, and the application is INSERT/UPDATE bound anyway, because memcache will be used massively. Remember the coding rule: the database never notices a browser reload.

Now I can even use AWS S3, which is the easiest massively scalable stupid database, or SimpleDB, with my data web service on multiple load-balanced EC2 instances. I don't have to change anything in the application. I just implement a simple 4-method storage driver on a single page. For the application, swapping the DB technology is a 1-line configuration change.

I can proxy the request easily and do interesting stuff:
  • Partitioning: User IDs up to 1,000,000 go to http://zero.domain.tld. The next million go to http://one.domain.tld.
  • Replication: All the data may be stored twice for long distance speed reasons. The US-cluster may resolve the web service host name differently than the EU cluster. Data is always fetched from the local data service. But changes are replicated to the other continent using the same protocol. No binary logs across continents.
  • Backup: I can duplicate changes as backup into another DB, even into another DB technology. I don't know yet how to backup SimpleDB. But if I need indexing and want to use SimpleDB, then I can put the same data into S3 for backup.
  • Eventual persistence: The data service can collect changes in memory and batch-insert them into the real database.
All done with Web technologies and one-pagers of code and the app won't notice.
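The partitioning rule from the first bullet is a one-liner in the proxy. A sketch with the host names from the example (the function name and shard list layout are mine):

```python
# Hypothetical shard list; each shard covers one million user IDs.
SHARDS = ["http://zero.domain.tld", "http://one.domain.tld"]

def shard_for_user(user_id):
    """Route a user ID to its data service host (1,000,000 IDs per shard).
    IDs 1..1,000,000 map to shard 0, the next million to shard 1, etc."""
    return SHARDS[(user_id - 1) // 1_000_000]
```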

Update 1:

Supporting multi-item result sets as a 'Get' response might be worth the effort. I propose two different 'Get' operations. The first takes the primary key and no condition; it will always return at most one item. The second 'Get' takes a condition but no primary key and might return multiple items. (Having both a primary key and a condition in the 'Get' makes no sense anyway.) The multi-item response will use the SRPC Array Response.

On the network:

Request:
  1. POST /srpc HTTP/1.1
  2. Content-length: ...

  3. Method=Data.Get
  4. _Type=TestTable
  5. _Condition=Age=3\nGender=male
  6. _Names=Nickname Identity

Comment: _Condition is a key-value list. It is encoded like an 'embedded' SRPC: a key=value\n format with \n escaping to get it onto a single line. _Names is a value list. Tokens of a value list are separated by a blank (0x20), and blanks inside tokens are escaped as '\ '. Sounds complicated, but it is easy to parse and read.
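The two encodings described above can be sketched in a few lines (Python for illustration; the helper names are mine, and the condition encoder assumes values contain no real line feeds):

```python
def encode_condition(condition):
    """Encode a key-value dict as an 'embedded' SRPC: pairs joined by a
    literal backslash-n. Assumes values contain no real newlines."""
    return "\\n".join("%s=%s" % (k, v) for k, v in condition.items())

def encode_names(names):
    """Encode a value list: tokens separated by a blank (0x20),
    blanks inside tokens escaped as backslash-blank."""
    return " ".join(name.replace(" ", "\\ ") for name in names)
```

With the request example above, encode_condition({"Age": "3", "Gender": "male"}) produces the _Condition value and encode_names(["Nickname", "Identity"]) produces the _Names value.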

Response:
  1. HTTP/1.1 200 OK
  2. Content-length: ...

  3. Status=1
  4. 0:Nickname=Planta
  5. 0:Identity=http://ydentiti.org/test/Planta/identity.xml
  6. 1:Nickname=Wolfspelz
  7. 1:Identity=http://wolfspelz.de/identity.xml

I have not yet decided about queries with multiple primary keys. They could be implemented as
  1. SRPC Batch with multiple queries in a single HTTP request, or
  2. a specific multi-primary-key syntax, similar to SQL: "WHERE id IN (1,2,3)".
The responses would be almost identical, because an SRPC Batch response is very much like an SRPC Array Response. Solution 2 adds a bit of complexity to the interface with a new multi-key request field. Solution 1 does not need an interface extension, but it puts the burden on the data web service, which must re-create multi-key semantics from a batch of single-key queries for optimal database access.

Update 2:

I agree with Ingo that solution 1 (SRPC Batch) makes all operations batchable and keeps the interface simple at the same time. The trade-off, that the web service must detect multi-key semantics in a batch, is probably not too severe. Clients will usually batch only similar requests together. In the beginning the web service can just execute multiple database transactions. Later it can improve performance with a bit of code that aggregates the batch into a single multi-key database request.


Update 3:

In order to allow for the later addition of type safety and other yet unknown features, I define here, now and forever, that SRPC keys containing "/" (forward slash) are treated as meta-data for the corresponding keys without the "/". Specifically, they must not be treated as database (column) names. That is no surprise from the SRPC point of view, but I just wanted to make it clear. I have no idea why anyone would use "/" in key names anyway. I find even "_" and "-" disturbing. By the way: ":" (colon) is also forbidden in keys, for the benefit of SRPC Batch. In other words: please stick to letters and numbers.



Update 4:

I removed the "Database" parameter. "Type" is enough for the data service to figure out where to look for the data. "Type" is a string; it can contain "Database/Table".



_happy_decoupling()

May 23, 2008

Simple Remote Procedure Call - Response Format

SRPC-Response Format is an extension to SRPC. It specifies a request parameter which selects a response format.
This is primarily intended for the REST variant of SRPC, where the SRPC parameters are in the request URI and the result is the response body. Usually the response body carries a data structure, but it is not clear in which format the data is encoded. A "Format" parameter in the request can select whether the response is encoded as XML, JSON, WDDX, etc.

  1. Format=<one of: json, php, wddx, xml, yaml, etc.>

Example:

  1. Format=xml

XML Example: HTTP query:

  1. C: GET /srpc.php?Method=GetPrices&Symbol=GOOG&Date=1969-07-21&Format=xml HTTP/1.1

HTTP response (with a sample XML as the result value):

  1. S: HTTP/1.1 200 OK
  2. S: Content-type: text/xml
  3. S:
  4. S: <?xml version="1.0"?>
  5. S: <prices>
  6. S: <price time="09:00">121.10</price>
  7. S: <price time="09:05">121.20</price>
  8. S: </prices>

JSON Example: HTTP query:

  1. C: GET /srpc.php?Method=GetPrices&Symbol=GOOG&Date=1969-07-21&Format=json HTTP/1.1

HTTP response (with a sample JSON as the result value):

  1. S: HTTP/1.1 200 OK
  2. S: Content-type: application/json
  3. S:
  4. S: {
  5. S: "prices": [
  6. S: { "time": "09:00", "value": "121.10" },
  7. S: { "time": "09:05", "value": "121.20" }
  8. S: ]
  9. S: }

August 5, 2007

Simple Remote Procedure Call

In my projects we often use remote procedure calls. We use various kinds, SOAP, XMLRPC, REST, JSON, conveyed by different protocols (HTTP, XMPP, even SMTP). We use whatever is appropriate in the situation, be it client-server, server-service, client-p2p, and depending on the code environment C++, C#, JScript, PHP.
With SOAP and XMLRPC you don't want to generate or parse the SOAP XML by hand. That's an avoidable error source. Rather, you use a library which does the RPC encoding/decoding job. To do that you have to get used to the lib's API, its modes of operation, and its quirks.
This is significant work until you are really in "complete advanced control" of the functionality, especially if all there is to exchange is a method name with parameters. Even more bothersome is the fact that most such libraries weigh megabytes and bring their own XML parser and their own network components: stuff we already have in our software for other purposes.
What we really need is a simple way to execute remote procedure calls

  • with an encoding so easy and fail safe, that it needs no library to en/decode,
  • that is so obvious that we do not need an industry standard like SOAP just to tell other developers what the RPC means.
The solution is a list of key-value pairs. This is Simple RPC (SRPC):
  • request and response are lists of key-value pairs,
  • each parameter is key=value
  • parameters separated by line feed
  • request as HTTP-POST body or HTTP-GET with query
  • response as HTTP response body
  • Content-type text/plain
  • all UTF-8
  • values must be single line (must not contain line feeds)
  • request method as Method=
Example (I love stock quote examples):
HTTP-POST request body:
  1. C: POST /srpc.php HTTP/1.1
  2. C: Content-type: text/plain; charset=UTF-8
  3. C: Content-length: 43
  4. C:
  5. C: Method=GetQuote
  6. C: Symbol=GOOG
  7. C: Date=1969-07-21
HTTP response body:
  1. S: HTTP/1.1 200 OK
  2. S: Content-type: text/plain; charset=UTF-8
  3. S:
  4. S: Status=1
  5. S: Average=123
  6. S: Low=121
  7. S: High=125
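The point of SRPC is that no library is needed; encoding and decoding fit on a few lines. A minimal sketch of the exchange above (Python for illustration; the helper names are mine):

```python
def encode_request(method, params):
    """Encode an SRPC request body: key=value pairs, one per line, UTF-8."""
    lines = ["Method=" + method]
    lines += ["%s=%s" % (k, v) for k, v in params.items()]
    return "\n".join(lines).encode("utf-8")

def decode_response(body):
    """Decode an SRPC response body into a dict of strings."""
    result = {}
    for line in body.decode("utf-8").splitlines():
        if line:
            key, _, value = line.partition("=")
            result[key] = value
    return result
```

The client POSTs the encoded body with Content-type text/plain; charset=UTF-8 and checks "Status" in the decoded response.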
Additional options:
1. Multiline Values:
Of course, there are sometimes line feeds in RPC arguments and results. Line feeds must be encoded using HTTP URL encoding (%0A) or the more readable "cstring" encoding (\n). The encoding is specified as a meta parameter:
  1. News=Google%20Introduces%20New...%0AAnalyst%20says...
  2. News/Encoding=URL
or:
  1. News=Google Introduces New...\nAnalyst says...
  2. News/Encoding=cstring
The "cstring" encoding escapes line feed (\n), carriage return (\r), and backslash (\\). The "cstring" encoding indication, e.g. "News/Encoding=cstring", may be omitted.
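A "cstring" codec can be sketched in a few lines (Python for illustration; the helper names are mine). Escaping the backslash first keeps the encoding unambiguous:

```python
def cstring_escape(value):
    """Escape backslash, line feed, and carriage return for a single-line value.
    Backslash must be escaped first so the result decodes unambiguously."""
    return (value.replace("\\", "\\\\")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

def cstring_unescape(value):
    """Reverse cstring_escape; unknown escape pairs pass the char through."""
    out, i = [], 0
    while i < len(value):
        if value[i] == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            out.append({"n": "\n", "r": "\r", "\\": "\\"}.get(nxt, nxt))
            i += 2
        else:
            out.append(value[i])
            i += 1
    return "".join(out)
```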
2. Binary Values:
Binary values in requests and responses are base64 encoded. An optional "Type" meta parameter carries a MIME type to indicate the data type, e.g. for image data.
  1. Chart=R0lGODlhkAH6AIAAAOfo7by/wCH5BA... (base64 encoded GIF)
  2. Chart/Encoding=base64
  3. Chart/Type=image/gif
3. The Query Variant:
Even complex result values, such as XML data, must be single line. Following the scheme above, this can be done by using "base64" or "cstring" encoding. Neither is easily readable in the case of XML. SRPC offers a simpler way to return a single result value: if the request is an HTTP GET with a query, then the result value comes as the response body with a Content-type. It is a normal HTTP request, but SRPC conformant.
HTTP query:
  1. C: GET /srpc.php?Method=GetPrices&Symbol=GOOG&Date=1969-07-21 HTTP/1.1
HTTP response (with a sample XML as a single result value):
  1. S: HTTP/1.1 200 OK
  2. S: Content-type: text/xml
  3. S:
  4. S: <?xml version="1.0"?>
  5. S: <prices>
  6. S: <price time="09:00">121.10</price>
  7. S: <price time="09:05">121.20</price>
  8. S: </prices>
4. Special Keys
There are 3 special keys defined:
  • request "Method=FunctionName" (RPC method)
  • response "Status=1" (1=OK, 0=error)
  • response "Message=An explanation" (an accompanying explanation for Status=0 or 1)
This Simple RPC specifies exactly how RPC requests are encoded. It is just lists of key=value pairs, but still powerful enough for all the RPCs we need.
happy_coding()

November 21, 2008

Simple Remote Procedure Call - TCP

SRPC-TCP is an extension to SRPC. It specifies the encoding of SRPC over TCP.
Rules:

  • The message format over plain TCP instead of HTTP is identical to the HTTP POST body.
  • A message is terminated by an empty line, in other words: 2 consecutive newlines.
  • Multiple SRPC messages may be sent over the same TCP connection in both directions.
  • Requests have a "Method" (it may, but need not, be the first line).
  • Responses have a "Status" (it may, but need not, be the first line).

Example:

  1. C: Method=GetQuote
  2. C: Symbol=GOOG
  3. C: Date=1969-07-21
  4. C:
  1. S: Status=1
  2. S: Average=123
  3. S: Low=121
  4. S: High=125
  5. S:

Options:

  • Events: messages may be sent in one direction without a response.
  • Streaming: multiple request messages may be sent back to back before the corresponding responses are received by the client.
  • Ordering: responses may be sent out of order (needs SrpcId, see below).
SrpcId:
Responses are associated with requests by a special key called SrpcId. If the SrpcId key/value pair is included in a request, then the response must include the same key/value pair, with the value passed through uninterpreted. The SrpcId helps to match a response to its request.

Example:

  1. C: Method=FirstRequest
  2. C: SrpcId=abc
  3. C:
  4. C: Method=SecondRequest
  5. C: SrpcId=def
  6. C:
  7. S: Status=1
  8. S: SrpcId=def
  9. S:
  10. S: Status=0
  11. S: Message=error
  12. S: SrpcId=abc
  13. S:
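Message framing and SrpcId correlation can be sketched as follows (Python for illustration; the helper names are mine). A receiver buffers the stream, splits it at empty lines, and uses the SrpcId to find the pending request:

```python
def split_messages(stream):
    """Split a received character stream into SRPC messages.
    A message is terminated by an empty line (two consecutive newlines).
    Returns the complete messages plus any unterminated remainder."""
    parts = stream.split("\n\n")
    # the last part has no terminating empty line yet; keep it buffered
    return parts[:-1], parts[-1]

def match_response(pending, response_lines):
    """Pop and return the pending request matching a response's SrpcId."""
    fields = dict(line.partition("=")[::2] for line in response_lines)
    return pending.pop(fields["SrpcId"])
```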

April 22, 2008

Simple Remote Procedure Call - Batch

SRPC-Batch is an extension to SRPC. The batch-mode carries multiple remote procedure calls in a single transaction. The global "Method" indicates the batch mode. Individual RPCs are prefixed by an index, e.g. "1:Method=...".

Example:

  1. C: Method=Batch
  2. C: 0:Method=GetQuote
  3. C: 0:Symbol=GOOG
  4. C: 1:Method=GetQuote
  5. C: 1:Symbol=AAPL
  1. S: Status=1
  2. S: 0:Status=1
  3. S: 0:Average=123
  4. S: 0:Low=121
  5. S: 0:High=125
  6. S: 1:Status=1
  7. S: 1:Average=456
  8. S: 1:Low=455
  9. S: 1:High=457

Rationale:

In rare cases clients want to execute not just one, but multiple commands. Batching saves network bandwidth and round-trip time, especially on SSL connections. It also allows a batch of RPCs to be executed consecutively. We also use batch commands to store them in the database and execute multiple commands on request.

Details:

  • the request has a "Method=Batch",
  • the request contains multiple remote procedure calls,
  • parameters of individual RPCs are prefixed by an index N and a colon: "N:", e.g. "1:",
  • the index indicates individual RPCs,
  • all parameters of an individual RPC have the same index,
  • the index starts with "0" (zero),
  • each RPC has a "Method" parameter (1:Method=...),
  • meta parameters as usual: "1:Symbol/Encoding=cstring",
  • the response has a "Status=..." (0/1) indicating success of the batch-parser,
  • the response carries a "Status" for each individual request,
  • result parameters use the same syntax as the requests (1:Status=...),
  • RPC results have the same index as the corresponding request,
  • the receiver executes ALL commands and returns their result even if some fail.
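Splitting a batch into individual RPCs is the same index-prefix logic as in SRPC Array Response, which is why the decoder can be shared. A sketch of a batch parser (Python for illustration; the function name is mine):

```python
def split_batch(params):
    """Split a batch request into individual RPCs by their index prefix.
    Unprefixed keys (like the global 'Method=Batch') are returned separately."""
    rpcs, globals_ = {}, {}
    for key, value in params.items():
        prefix, colon, rest = key.partition(":")
        if colon and prefix.isdigit():
            rpcs.setdefault(int(prefix), {})[rest] = value
        else:
            globals_[key] = value
    return globals_, [rpcs[i] for i in sorted(rpcs)]
```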

Comments:

  • the batch extension is optional; receivers are not required to support it. Better ask your server whether it supports it,
  • in addition to "Method", the request may have additional "global" parameters.

October 29, 2014

Microservices bei Weblin

Once again, a principle that we developed and used at Weblin has been given a name: microservices.

"Microservices is a software architecture design pattern, in which complex applications are composed of small, independent processes communicating with each other using language-agnostic APIs"

Of course, we did not invent it. Many other good software engineers did the same thing at the same time, and by now the principle (= architecture design pattern) has reached the mainstream, has a name, and there are many articles and talks about it.

The idea is that you do not build one fat application, but several (many) logically separated web services, each of which provides one part of the overall system's functionality, and which communicate with each other. This applies to client/server communication as well as to server frontend/backend communication and to communication within the backend. Microservices can be written in different languages and typically each has its own database (though often on the same database server). Microservices can scale horizontally or vertically, i.e. they can all run on the same server farm, or you can assign dedicated servers to individual microservices.

Which web service protocol you choose is of minor importance. You can actually combine transport protocol and data format at will. REST/JSON is currently the tool of choice for this, but SOAP works too, and XMLRPC used to be very widespread. At Weblin we often used key/value/LF as the data format (affectionately called SRPC), because it is usually completely sufficient. You can also go more sophisticated, e.g. with Protocol Buffers as the data format. HTTP is the obvious transport protocol, but plain TCP or a message bus work as well.

At Weblin we had microservices for:

  • User data (identity), populated by the frontend, used by the client
  • User-created content upload (the famous file service: files.zweitgeist.com)
  • Download server
  • Wallet service and points account
  • Topsites service
  • XMPP server management service
  • Unit test (system runtime test) as a web service
  • GeoIP resolution as a web service
  • Contact list management
  • Wuscheln, publisher (everything the client wanted, I'll just say "srpc.php")
  • VPI server
  • Compute service (avatar generator)
  • Locatr
  • Ad server
_happy_eigenlobing()

March 3, 2010

Welcome Neptun

In my spare time I am currently implementing a virtual items management system for the OVW project. During October/November 2009 I programmed a rudimentary item inventory as a web portal in C#. I am developing on Windows/.NET and running the production server on Linux/mono. (BTW: mono is really great.)

Since I had no item server, I implemented a quick web service, which simulated an item server to populate my inventory with dummy items from a dummy web service. This is a view of the inventory as a web page.

The simulator worked directly on the database, with lots of SELECTs each time you accessed the inventory page, which shows all items. As we all know, caching helps take load off the database, but in the case of virtual items, caching is not enough. Virtual items are active. They live. They have timers. They interact, and they change over time. Every single activity invalidates the cache. What virtual items need is an active cache.

That's what I programmed in the last 3 weeks: a specialized cache just for virtual items, which can be accessed like a web service via HTTP. The protocol could have been SOAP (too heavy) or REST (too unstructured). As you might have guessed, the protocol actually is SRPC.

So, now I have an item server that

  • populates the inventory and
  • serves as an active cache for the database, and
  • has a web service interface.
The thing is called "Neptun". (Yep, I just noticed that it should be "neptune".)

The inventory web page above is for users. Developers can also see items in the item server, but they are much more bare-bones there. Neptun has a web user interface: a list of item numbers and item properties. Not much for the user, but informative for developers.

Neptun has been developed with lots of unit tests and integration tests (as usual). And here is more proof that unit testing is king: when I replaced my dummy item service with the real item server by exchanging just one web service URL, the thing just worked.

_happy_orbiting()

Thanks to Ingo for consulting at the Silpion party.