Re-Implementing the Google Reader API in 2025

Tags: rss, elixir, development

Over Christmas, I decided to finally tackle learning Elixir. Since I learn new programming languages best by actually building something, I set out to implement a simple RSS reader/sync service, similar to Miniflux and FreshRSS.

Since I actually want to use this app, it needs to have a way to connect to my client apps (Reeder 5/Classic on iOS and macOS, as well as NewsFlash on Linux). Since the Google Reader API implementation is well-supported in most RSS clients, I set out to build an API compatible with the existing spec.

As it turns out, it’s hard: The API was never officially documented, and existing implementations rely on reverse-engineered specs. Even though robust open-source implementations exist along with some documentation, getting it working - especially with Reeder - was not easy. I had to do additional reverse engineering to get all the details right. The original API spec dates back to 2005, uses multiple response formats, and is not particularly well-structured compared to modern REST APIs.

Additionally, many reverse-engineered docs suggest implementations that might be compatible with the “original” Google Reader API spec but unfortunately do not work with many clients, which have evolved to support slight deviations from the spec. While this documentation is by no means complete, it should provide you with a functional baseline for the most essential features:

  • Logging in
  • Fetching subscriptions
  • Fetching feed items
  • Changing starred and read status

Below, I’ll do my best to document the most important parts of the API, including all the details needed to get things working.

You can find the almost “full” implementation (with some mock values for now) here: https://github.com/dstapp/feedproxy/blob/main/lib/feedproxy_web/controllers/greader_api_controller.ex

I don’t claim this implementation is complete or even bug-free, but it got me to a state where I can confidently use my RSS clients with it.

Concepts

In general, there are two entities: Feeds and Feed Items. A Feed is a subscription to a given RSS URL, while a Feed Item is a single news item inside the RSS feed.

URLs

Most of the URLs start with /reader/api/0. The 0 is a version parameter, but there was never anything other than 0, so you can hardcode that value.

Authentication

There are two types of tokens that the user receives: a long-lived “auth token” and a short-lived “session token.”

The long-lived token, which the client gets back from /accounts/ClientLogin, is sent in each subsequent request as an Authorization header:

Authorization: GoogleLogin auth={authtoken}

According to reverse-engineered docs, this token remains valid until the user “logs out”. Modern clients never do, so just provide a token that lets you verify the user’s identity.

The session token is provided via the T request parameter in scenarios where data is being modified (e.g., marking items as read or starring items).

Realistically, we’re usually not dealing with classified material here. This is why existing implementations like FreshRSS and Miniflux don’t even bother generating a short-lived token. In FreshRSS, they generate a SHA-1 hash based on the system salt, username, and respective password hash, making the token effectively permanent, with no expiration.

If any request cannot be authenticated, return a 401 Unauthorized HTTP response.
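
As a minimal sketch of the header handling described above, the function below pulls the token out of an Authorization header value. The module and function names are hypothetical and not taken from the linked implementation; looking up the user for the token is left out.

```elixir
defmodule AuthHeader do
  # Hypothetical helper: extracts the token from "GoogleLogin auth={authtoken}".
  # Returns {:ok, token}, or :error when the caller should answer with 401.
  def extract_token("GoogleLogin auth=" <> token) when byte_size(token) > 0 do
    {:ok, token}
  end

  # Anything else (missing header, other scheme, empty token) is rejected.
  def extract_token(_other), do: :error
end
```

An `:error` result maps directly to the 401 Unauthorized response mentioned above.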

Streams

A stream describes a “category” of feed items to be returned. When you request data for a specific stream, you get back a list of feed items. These are the existing streams:

  • user/-/state/com.google/starred: Returns feed items that are marked with a star
  • user/-/state/com.google/read: One would expect this to return a list of read (not unread) items, but most clients rely on this to return everything, both read and unread items.
  • user/-/state/com.google/reading-list: Returns all feed items, regardless of their read/starred state.
  • user/-/state/com.google/kept-unread: I expose this, but in my case, it just returns the same as reading-list. I haven’t observed any client requesting it.

Depending on which stream is requested, you may need to adjust your database query to return only the relevant data.
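
One way to keep that decision in a single place is a lookup from stream ID to a filter, sketched below. The filter atoms `:starred_only` and `:all` are made up for illustration; a real implementation would translate them into database queries.

```elixir
defmodule StreamFilter do
  # Hypothetical filter atoms, following the behaviour described in the list above.
  @filters %{
    "user/-/state/com.google/starred" => :starred_only,
    "user/-/state/com.google/read" => :all,
    "user/-/state/com.google/reading-list" => :all,
    "user/-/state/com.google/kept-unread" => :all
  }

  # Returns {:ok, filter} for a known stream, :error otherwise.
  def for_stream(stream_id), do: Map.fetch(@filters, stream_id)
end
```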

Those strings are not only used as streams, but also as tags for feed items, to indicate they are starred or read, etc. But we’ll tackle that below.

Item IDs

According to the spec, there are multiple ways to represent Feed Item IDs. Clients may use different formats when requesting data, so your implementation must support all of them. When generating API responses, you must use specific representations for different endpoints. Internally, IDs should be stored as integers, as some clients cannot parse UUIDs or MongoDB-style IDs. These are the available representations:

Long-form hex representation: Looks like tag:google.com,2005:reader/item/000000000000001F. It has a fixed prefix tag:google.com,2005:reader/item/ followed by a hexadecimal representation of your integer ID, left-padded with 0 to a length of 16 characters.

Short-form hex representation: I didn’t find this documented anywhere, but at least Reeder sometimes sends the 16-digit padded hexadecimal version without the tag:google.com,2005:reader/item/ prefix, like 000000000000001F.

Short-form decimal representation: The plain decimal value of the internal integer ID.

You’ll need a parser function that can handle any of these ID formats and return the decimal value. Here’s mine:

defp parse_reader_id(id) do
  case id do
    "tag:google.com,2005:reader/item/" <> hex_id ->
      # Long-form ID: fixed prefix followed by the hex value
      parse_int(hex_id, 16)

    hex_id when byte_size(hex_id) == 16 ->
      # Short-form hex ID (16 chars, zero-padded)
      parse_int(hex_id, 16)

    raw_id ->
      # Plain decimal ID
      parse_int(raw_id, 10)
  end
end

# Only accept input that parses completely, so trailing garbage
# such as "1Fxyz" is rejected instead of silently truncated.
defp parse_int(str, base) do
  case Integer.parse(str, base) do
    {int_id, ""} -> {:ok, int_id}
    _ -> :error
  end
end
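
For the opposite direction, turning an internal integer ID into the hex representations described above, a sketch could look like this. The module name `ReaderId` is hypothetical.

```elixir
defmodule ReaderId do
  @prefix "tag:google.com,2005:reader/item/"

  # Long-form representation: fixed prefix plus the 16-digit hex value.
  def to_long_form(id) when is_integer(id) and id >= 0 do
    @prefix <> to_short_hex(id)
  end

  # Short-form hex: uppercase hex digits, left-padded with "0" to 16 characters.
  def to_short_hex(id) when is_integer(id) and id >= 0 do
    id |> Integer.to_string(16) |> String.pad_leading(16, "0")
  end
end
```

For example, `ReaderId.to_long_form(31)` yields `tag:google.com,2005:reader/item/000000000000001F`.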

Feed IDs

Feed IDs always follow the format feed/{id}. I recommend using integer IDs, though it might work with unique string values. However, for best compatibility with existing clients, decimal integers are the safest choice.
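
A small sketch for converting between the feed/{id} format and integer IDs, assuming integer feed IDs as recommended above. The module name is hypothetical.

```elixir
defmodule FeedId do
  # Parses "feed/3" into {:ok, 3}; anything else yields :error.
  def parse("feed/" <> raw) do
    case Integer.parse(raw) do
      {id, ""} -> {:ok, id}
      _ -> :error
    end
  end

  def parse(_other), do: :error

  # Formats an integer ID as "feed/{id}".
  def format(id) when is_integer(id), do: "feed/" <> Integer.to_string(id)
end
```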

Continuation Tokens

A continuation token is essentially a pagination parameter. You can implement this using timestamps or item counts. For example, if you return 20 items per page, you send 20 items and return a continuation token of 20. On the next request, the client sends 20, so you start at an offset of 20, fetch another 20, and return 40 as the next continuation token.
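
The offset-based variant can be sketched as follows. This is an illustration under assumptions: the in-memory list stands in for a sorted database result, the names are hypothetical, and returning no token on the last page is my assumption rather than something the spec prescribes.

```elixir
defmodule Pagination do
  # Returns {page, next_token}. `items` stands in for an already sorted result
  # set; a real implementation would push offset and limit into the query.
  def page(items, continuation, page_size \\ 20) do
    offset = parse_offset(continuation)
    page = Enum.slice(items, offset, page_size)
    next_offset = offset + length(page)

    # Assumption: no token is returned once the last page has been served.
    next_token = if next_offset < length(items), do: Integer.to_string(next_offset), else: nil

    {page, next_token}
  end

  # A missing or malformed token starts from the beginning.
  defp parse_offset(token) when is_binary(token) do
    case Integer.parse(token) do
      {offset, ""} when offset >= 0 -> offset
      _ -> 0
    end
  end

  defp parse_offset(_other), do: 0
end
```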

Dates

Dates and times are generally exposed as UNIX timestamps, either in seconds, milliseconds, or microseconds. They should always represent UTC time. Be sure to convert timezones when fetching feeds.

Also, some fields are ints while others are strings. Make sure to get this right, as clients are very picky about it.
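
A few small helpers can make the unit and type of each timestamp explicit. This is a sketch with hypothetical names; it relies only on `DateTime.to_unix/2` from the standard library.

```elixir
defmodule ReaderTime do
  # Seconds since the epoch as an integer (e.g. for "published").
  def seconds(%DateTime{} = dt), do: DateTime.to_unix(dt, :second)

  # Milliseconds since the epoch as a string (e.g. for "crawlTimeMsec").
  def msec_string(%DateTime{} = dt) do
    dt |> DateTime.to_unix(:millisecond) |> Integer.to_string()
  end

  # Microseconds since the epoch as a string (e.g. for "timestampUsec").
  def usec_string(%DateTime{} = dt) do
    dt |> DateTime.to_unix(:microsecond) |> Integer.to_string()
  end
end
```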

Output Format

Most endpoints support an output parameter (either as a query param or in the body) that determines the response format. It can be json or xml, but both of my clients always request json. For now, I haven’t even implemented XML.

Endpoints

Below I’ll outline the details of the required endpoints, including a short description of the corresponding functionality. There are more endpoints, but as far as I can tell they are not required: both clients that I tested worked fine without them, and existing server implementations don’t seem to implement them either.

I’ll put placeholder values in {} and unless otherwise annotated, they are strings.

Authentication

By now, there are multiple ways to do authentication for Google Reader APIs. While some compatible implementations, e.g. Inoreader, seem to use OAuth, this is the basic way, which should work with any Google Reader API-compatible client.

/accounts/ClientLogin

Authenticates the user based on username and password, and returns an auth token. Apparently you can omit the expires_in field, but I didn’t test that. authtoken is a string and can basically be whatever you need to authenticate and authorize the request.

FreshRSS uses {email}/ followed by a SHA-1 hash of the system salt, email and hashed password.
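
A token in that style could be sketched like this. The order in which the three values are concatenated before hashing is an assumption for illustration, and the module name is hypothetical; this is not FreshRSS’s actual code.

```elixir
defmodule FreshRssStyleToken do
  # Builds "{email}/{sha1}" from a salt, the email and the stored password hash.
  # Assumption: the three inputs are simply concatenated in this order.
  def build(salt, email, password_hash) do
    digest =
      :crypto.hash(:sha, salt <> email <> password_hash)
      |> Base.encode16(case: :lower)

    email <> "/" <> digest
  end
end
```

Because the token is derived from stored values only, it can be recomputed and compared on every request without keeping any session state.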

Endpoint

POST /accounts/ClientLogin

Request

Clients send either application/json or application/x-www-form-urlencoded.

Payload:

| Parameter | Required | Description |
|---|---|---|
| Email | yes | Username |
| Passwd | yes | Password |
| accountType | no | In my case always HOSTED_OR_GOOGLE |
| service | no | In my case always reader |
| client | no | Name of the client, e.g. Reeder |
| output | no | Output format: json or xml |

In reality, anything but Email and Passwd can be ignored for authentication purposes.

Response

Expected response encoding: text/plain

Expected response:

SID={authtoken}
LSID=null
Auth={authtoken}
expires_in={expiry in seconds, e.g. 604800}
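
Assembling that body is a matter of joining four lines. A minimal sketch with a hypothetical module name, where the default expiry is just the example value from above:

```elixir
defmodule ClientLoginResponse do
  # Builds the text/plain response body; `expires_in` is given in seconds.
  def body(auth_token, expires_in \\ 604_800) do
    Enum.join(
      [
        "SID=#{auth_token}",
        "LSID=null",
        "Auth=#{auth_token}",
        "expires_in=#{expires_in}"
      ],
      "\n"
    )
  end
end
```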

/reader/api/0/token

Returns the short-lived session token for the auth token.

Endpoint

GET /reader/api/0/token

Request

Nothing except what’s defined in concepts (auth token).

Response

Expected response encoding: text/plain

Example response:

16eec0206a01dc0cc6f7a362d907bfd2a0b731a1ZZZZZZZZZZZZZZZZ

The token should be padded if necessary to always be exactly 57 characters in size. For more details on what this actually is, see Concepts/Authentication above.

The endpoint sends the token back as plain text. The request itself still carries the Authorization header, so you can authenticate and authorize it as usual.
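
The padding can be sketched as below, assuming “Z” as the padding character as seen in the example response. The module name is hypothetical.

```elixir
defmodule SessionToken do
  @token_length 57

  # Pads a token with "Z" up to 57 characters; longer tokens are rejected.
  def pad(token) when is_binary(token) and byte_size(token) <= @token_length do
    {:ok, String.pad_trailing(token, @token_length, "Z")}
  end

  def pad(_token), do: :error
end
```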

/reader/api/0/user-info

Returns user info based on the auth token. I have not seen any of that used anywhere, but some clients request it.

Endpoint

GET /reader/api/0/user-info

Request

Query params:

| Parameter | Required | Description |
|---|---|---|
| output | no | Output format: json or xml |

Response

Expected response encoding: As per output, I always use application/json

Example response:

{
  "userId": "1",
  "userName": "demo",
  "userProfileId": "1",
  "userEmail": "user@example.com"
}

Subscriptions

Subscription endpoints return a list of subscriptions and tags that are set up in the system. My implementation does not support tagging, so I’m just returning the standard tags required by Google Reader API clients.

/reader/api/0/subscription/list

Returns a list of subscriptions.

Endpoint

GET /reader/api/0/subscription/list

Request

Query params:

| Parameter | Required | Description |
|---|---|---|
| output | no | Output format: json or xml |

Response

Expected response encoding: As per the output query param, I always return application/json.

Example response:

{
  "subscriptions": [
    {
      "id": "feed/1",
      "title": "Test feed",
      "categories": [], // String IDs of categories but can be an empty array
      "url": "RSS URL",
      "htmlUrl": "Website URL",
      "iconUrl": "Favicon URL"
    }
  ]
}

/reader/api/0/tag/list

Returns a list of tags. If your system does not support custom tags, at least return the Google API system ones as seen in the example response below.

I don’t implement tagging, but those tags (corresponding to the streams) are the ones that Google Reader API clients expect by default, so you can just return them as I do here.

Endpoint

GET /reader/api/0/tag/list

Request

Nothing special.

Response

Expected response encoding: application/json

Example response:

{
  "tags": [
    {"id": "user/-/state/com.google/starred"},
    {"id": "user/-/state/com.google/read"},
    {"id": "user/-/state/com.google/reading-list"},
    {"id": "user/-/state/com.google/kept-unread"}
  ]
}

Content retrieval

Now this is where things get tricky: there are multiple ways to retrieve actual Feed Items from the API and, you can bet, my two RSS clients (Reeder & NewsFlash) use different mechanisms.

NewsFlash uses the /reader/api/0/stream/contents/:streamId endpoint to fetch all data for a given stream directly, while Reeder first fetches a list of IDs for the given stream via /reader/api/0/stream/items/ids and then sends batch calls to /reader/api/0/stream/items/contents to load the actual contents for each ID.

/reader/api/0/stream/contents/:streamId & /reader/api/0/stream/items/contents

Returns the contents of a given stream, i.e. the actual Feed Items. There are multiple endpoints and request formats for compatibility with different clients.

Endpoint

GET /reader/api/0/stream/contents/:streamId and POST /reader/api/0/stream/contents/:streamId (it seems different clients use different methods to fetch it), as well as GET /reader/api/0/stream/items/contents

URL parameters:

  • streamId: The ID of the stream as seen above, e.g. user/-/state/com.google/starred, but without URL encoding, so it’s just added as part of the URL (make sure your URL parser can handle that)

Request

Possible request params (either as query params or application/x-www-form-urlencoded in the POST body):

| Parameter | Required | Description |
|---|---|---|
| i | no | One or more individual Feed Item IDs (in any of the three formats described above). This is not an array: because the payload is urlencoded, a single item looks like i=000000000000001F, while multiple items repeat the same param, like i=000000000000001D&i=000000000000001E&i=000000000000001F. My body parser could not handle that, so I had to read the raw request and split it manually. |
| xt | no | “Exclude target”, so a Stream ID, or tag, that a Feed Item must not have to be returned |
| n | no | Amount of items to request |
| ot | no | “Older than”, UNIX timestamp to get increasingly older items |
| c | no | “Continuation token”, so starting from item x for pagination. See above for more info. |
| output | no | Output format, could be “xml” or “json”. Defaults to json |

Feed items should be returned sorted by publishing date in descending order, so starting from the newest one; otherwise the ot query param does not make sense.
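
The repeated i parameter is what usually trips up body parsers that collapse duplicate keys. One way to collect every value from the raw urlencoded string is `URI.query_decoder/1` from the standard library, sketched here with a hypothetical module name:

```elixir
defmodule ItemParams do
  # Collects every value of the repeated "i" parameter, in request order.
  # `raw` is the raw query string or urlencoded request body.
  def item_ids(raw) when is_binary(raw) do
    for {"i", value} <- URI.query_decoder(raw), do: value
  end
end
```

The decoder also takes care of percent-encoding, so long-form IDs arrive in a form the ID parser can handle.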

Response

Expected response format: Based on the output. I always return application/json.

Example response:

{
  "id": "user/-/state/com.google/reading-list", // requested stream
  "updated": 12312412, // UNIX timestamp of now as an int
  "items": [
    {
      "id": "tag:google.com,2005:reader/item/000000000000001F", // long form ID
      "title": "Example post",
      "published": 123123123, // UNIX timestamp as int
      "crawlTimeMsec": "123123123", // UNIX timestamp crawl time in milliseconds as string
      "timestampUsec": "123123123000",  // UNIX timestamp crawl time in microseconds as string
      "alternate": [{
        "href": "https://example.com/post" // url to the post
      }],
      "canonical": [{
        "href": "https://example.com/post" // same as alternate
      }],
      "summary": {
        "content": "" // excerpt or post body
      },
      "categories": [ // categories, but at least the standard tags that apply for the given post
        "user/-/state/com.google/reading-list",
        "user/-/state/com.google/read",
        "user/-/state/com.google/starred"
      ],
      "origin": {
        "streamId": "feed/3", // ID of the Feed
        "title": "sample feed",
        "htmlUrl": "https://example.com"
      }
    }
  ]
}

Please make sure to have the given timestamps as int or string respectively, as pointed out in the example response.
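
The categories array can be derived from per-item state. The sketch below assumes two boolean flags per item (hypothetical names) and assumes that every item carries the reading-list tag, with read and starred added only when they apply.

```elixir
defmodule ItemCategories do
  @prefix "user/-/state/com.google/"

  # `is_read` and `is_starred` are hypothetical boolean flags stored per item.
  def for_item(is_read, is_starred) do
    base = [@prefix <> "reading-list"]
    read = if is_read, do: [@prefix <> "read"], else: []
    starred = if is_starred, do: [@prefix <> "starred"], else: []

    base ++ read ++ starred
  end
end
```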

/reader/api/0/stream/items/ids

Returns a list of item IDs for the given stream. Same logic as above as far as data fetching goes, but we only return the IDs as pure decimal values.

Endpoint

GET /reader/api/0/stream/items/ids

Request

Possible request params:

| Parameter | Required | Description |
|---|---|---|
| s | yes | The ID of the stream as seen above, e.g. user/-/state/com.google/starred |
| n | no | Amount of items to request, up to 10000 |
| ot | no | “Older than”, UNIX timestamp to get increasingly older items |
| xt | no | “Exclude target”, so a Stream ID, or tag, that a Feed Item must not have to be returned |
| it | no | “Include target”, inverse of xt |
| r | no | If set to o, the sort order is reversed |
| output | no | Output format, could be “xml” or “json”. Defaults to json |

Response

Expected response encoding: According to output, for now I always use application/json.

Example response:

{
  "itemRefs": [
    { "id": "1" },
    { "id": "2" },
    { "id": "3" }
  ]
}
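
Building this structure from internal integer IDs is short; note that each ID is emitted as a decimal string, as in the example. The module name is hypothetical, and the resulting map still has to be encoded as JSON by whatever encoder you use.

```elixir
defmodule ItemRefs do
  # Wraps integer IDs in the itemRefs structure, each ID as a decimal string.
  def build(ids) when is_list(ids) do
    %{"itemRefs" => Enum.map(ids, fn id -> %{"id" => Integer.to_string(id)} end)}
  end
end
```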

/reader/api/0/unread-count

Returns the unread count for each feed individually, including the timestamp of the newest item. If all=1, your list may contain Feeds with a count of 0.

Endpoint

GET /reader/api/0/unread-count

Request

Possible query params:

| Parameter | Required | Description |
|---|---|---|
| all | no | Include feeds with no unread items if 1 is provided as a value |
| output | no | Output format, could be “xml” or “json”. Defaults to json |

Response

Expected response encoding: According to output, I always use application/json.

Example response:

{
  "max": 5, // integer representing the sum of all "count"s
  "unreadcounts": [ // one item for each Feed
    {
      "id": "feed/1",
      "count": 5, // int representing count of unread items
      "newestItemTimestampUsec": "1231231230000" // UNIX timestamp in microseconds of the newest item as string or "0" 
    }
  ]
}
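
A sketch of how this response could be assembled. The input shape, a list of {feed_id, unread_count, newest_usec} tuples, is an assumption for illustration, as is the module name.

```elixir
defmodule UnreadCounts do
  # `feeds` is a hypothetical list of {feed_id, unread_count, newest_usec} tuples,
  # where newest_usec is an integer or nil when the feed has no items.
  # `include_empty` corresponds to all=1 in the request.
  def build(feeds, include_empty \\ false) do
    entries =
      for {feed_id, count, newest_usec} <- feeds, include_empty or count > 0 do
        %{
          "id" => "feed/#{feed_id}",
          "count" => count,
          "newestItemTimestampUsec" => Integer.to_string(newest_usec || 0)
        }
      end

    # "max" is the sum of all counts, as in the example response.
    max = entries |> Enum.map(fn entry -> entry["count"] end) |> Enum.sum()

    %{"max" => max, "unreadcounts" => entries}
  end
end
```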

State manipulation

Changing read and starred states is generally done through the edit-tag endpoint. That supports single and batch items through the i parameter (see above). Additionally, there’s a mark-all-as-read endpoint.

/reader/api/0/edit-tag

Edits a tag given in the request, so marking as read/unread or starred/unstarred.

Endpoint

POST /reader/api/0/edit-tag

Request

Request encoding: application/x-www-form-urlencoded

Request params:

| Parameter | Required | Description |
|---|---|---|
| i | no | One or more individual Feed Item IDs (in any of the three formats described above). This is not an array: because the payload is urlencoded, a single item looks like i=000000000000001F, while multiple items repeat the same param, like i=000000000000001D&i=000000000000001E&i=000000000000001F. My body parser could not handle that, so I had to read the raw request and split it manually. |
| a | no | Tag ID to add (url-encoded) |
| r | no | Tag ID to remove (url-encoded) |

Possible Tag IDs:

  • user/-/state/com.google/read
  • user/-/state/com.google/starred

Response

Response content type: text/plain (Reeder needs that)

Response:

OK

This endpoint must return a plain-text “OK” for some clients to work, while others don’t care apparently.
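
Translating the a and r parameters into state changes could look like the sketch below. The `:read` and `:starred` keys are hypothetical field names, and the sketch assumes at most one a and one r value per request.

```elixir
defmodule TagChanges do
  # Translates the decoded `a` (add) and `r` (remove) tag IDs into a map of
  # hypothetical field updates, e.g. %{read: true} or %{starred: false}.
  def from_params(params) when is_map(params) do
    Map.merge(change(params["a"], true), change(params["r"], false))
  end

  defp change("user/-/state/com.google/read", value), do: %{read: value}
  defp change("user/-/state/com.google/starred", value), do: %{starred: value}
  # Unknown or missing tags result in no change.
  defp change(_other, _value), do: %{}
end
```

The resulting map can then be applied to every item ID taken from the i parameter.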

/reader/api/0/mark-all-as-read

Marks all items in a stream as read, starting from the given timestamp.

Endpoint

POST /reader/api/0/mark-all-as-read

Request

Request encoding: application/x-www-form-urlencoded

Request params:

| Parameter | Required | Description |
|---|---|---|
| s | yes | Stream ID to update |
| ts | yes | UNIX timestamp to start from |

Response

Response content type: text/plain (Reeder needs that)

Response:

OK

This endpoint must return a plain-text “OK” for some clients to work, while others don’t care apparently.