Converting my website to AMP (Accelerated Mobile Pages)

Introduction

Viewing web pages on a mobile device (who are we even kidding, on bigger devices too) is a pretty awful experience. Everything takes a long time to load and shifts around while doing so, page layouts are often entirely broken on phones, and there are a bunch of obnoxious interstitial or otherwise obtrusive ads all over the place. In October 2015, Google announced the Accelerated Mobile Pages Project (hereafter "AMP", but also try using "amphtml" in web searches), a restricted subset of HTML5 that requires conformant pages to follow practices that guarantee they can be loaded and rendered quickly by mobile devices. (There's no reason you can't also use AMP to make pages for beefier non-mobile devices, although the practice doesn't seem to be common yet.) Among other restrictions, this is accomplished by disallowing external resources like JavaScript and CSS files that can block the rest of the page from being rendered, and by requiring pages to define the sizes of things like images upfront so they can be loaded asynchronously without shifting other elements around. There's also a Google-run CDN called the AMP Cache that promises to automatically cache and serve all valid AMP documents quickly.

At first glance, AMP doesn't seem all that different from Facebook Instant Articles or Apple News Format (why do you hate the word "the" so much, Apple?). Both of those standards have the stated goal of making it more pleasant to read articles on mobile devices, too. Facebook Instant Articles are also HTML5-based, and Instant Articles and AMP basically even have the same logo (Facebook Instant Articles logo vs. AMP logo). Apple News Format seems like a weird JSON-meets-Markdown thing, although I'm having trouble finding a simple example document.

But there are some big differences in how the standards are used. AMP pages can be served directly from your personal server and displayed by any modern browser (since they're just regular HTML5 pages with some AMP-provided JavaScript and CSS). Facebook Instant Articles and Apple News articles can only be viewed using Facebook's mobile apps or the Apple News app, respectively, and Facebook and Apple ultimately get to decide what gets published; the formats are designed to improve the article-reading experience on those closed platforms but don't do anything to improve the open web. Google searches are one way to get to the AMP versions of pages, but AMP is also used by Twitter, Pinterest, and LinkedIn, and Microsoft even announced that they were adding support to the Bing App in September 2016.

After reading about AMP, I decided to take a stab at creating AMP versions of the pages on my personal site (where you're probably reading this). This page has a list of the main requirements for an AMP document. The ones that stuck out the most to me are:

  • Pages need to include AMP boilerplate CSS via <style> and contain an AMP <script> element.
  • All custom CSS rules need to be inlined in a single <style> element in <head> (setting per-element CSS via style attributes isn't permitted).
  • No custom JavaScript is allowed. Depending on the page, this maybe isn't as bad as it sounds — there's a big collection of custom elements that are implemented via web components.
  • <img> is replaced with <amp-img>, which can apparently do all kinds of fancy lazy-loading and transcoding tricks. width and height attributes need to be present so the page can be laid out before any images have been downloaded.
  • AMP pages need to link to a (possibly-non-AMP) canonical URL.
  • There are a few other easy, miscellaneous requirements, like including an amp attribute in the <html> element. (You can instead include a ⚡ character (U+26A1, "HIGH VOLTAGE SIGN"), which seems like a great idea if you're the sort of person who hates yourself and wants to constantly copy-and-paste that symbol instead of just typing amp.)

My site

So, taking stock of the current state of my site: It's aggressively static. There's no database or server-side scripting, and I generate all of the HTML files locally before copying them to my webhost. To do that, I have a Ruby script named template_lib.rb that defines functions like start_page (returns <html><head>...</head><body>), end_page (returns </body></html>), and a few other helpers like start_box and end_box to emit containers. Each page on my site has a .rhtml file that looks similar to this:

<%= start_page("Page title", [other args]) %>

<%= start_box("Hello", [other args]) %>
<p>Here's some content!</p>
<p>(I never said it was interesting content.)</p>
<%= add_image("image.png", 1024, 768, [other args]) %>
<%= end_box %>

<%= end_page([args]) %>

Each <%= ... %> tag gets replaced by the output of the Ruby code inside of it. There's also a build script that checks if there's an up-to-date .html file for each .rhtml file, and if not, generates one using ERB with a command like:

erb -r ./template_lib.rb my_page.rhtml >my_page.html

Screenshot of desktop site
A page viewed on desktop

Most pages have almost no JavaScript. On desktop, there's a fixed-position site tree (called navbox in my CSS rules) on the left side of the page. On mobile, the navbox appears at the top of the page and there's a button that can be clicked to expand and collapse it using some trivial JavaScript that changes an element's CSS classes to trigger transitions.

template_lib.rb also contains the site hierarchy and uses it to generate the navbox. The hierarchy is rendered differently for different pages: if the current page contains multiple headings or subpages of its own, that portion of the tree is expanded.

100/100 score for non-AMP page on Google PageSpeed Insights
Hooray, I win... nothing?

There's a base set of CSS rules, and then separate sets for desktop and mobile that are wrapped in @media(min-width:641px) and @media(max-width:640px) media queries. I got on a kick recently about trying to get a perfect score from Google PageSpeed Insights, so I'd already made some fortuitous changes like inlining these CSS rules into each page instead of loading them as external files. I'd also dropped Google Analytics, which makes you lose a point due to loading a JavaScript file that has a cache expiry of just two hours. (There's the added bonus that not tracking users seems like a nice thing to do.)

Remember when I said "almost no JavaScript"? That's true for most of the pages, but there are two that are more complicated: my glucometer page, which includes a lot of SVG graphs rendered by d3.js, and my pullup-bars-in-San-Francisco page, which has an embedded Google Maps map with custom markers and a bit of custom JS. I decided I'd just skip those for now.

Screenshots of mobile page with menu collapsed and expanded
A mobile page with the menu collapsed and expanded

Comparing the current site to AMP's list of rules, it was clear that the mobile navbox behavior would need to change: pages can't use custom JS to set classes to expand or collapse the navbox. I skimmed through some lists of AMP components and discovered amp-sidebar. It seemed like it'd be a better user experience than what I have right now on mobile, so let's use it!

It was also clear that I'd need to mush all of my custom CSS together into a single <style> element instead of having separate elements for the base, desktop, and mobile rules. While I was at it, I figured I should probably drop the desktop rules entirely when generating AMP pages and also omit the media query around the mobile rules for AMP; otherwise, an AMP page loaded in a desktop browser would be missing a bunch of styles.

Converting one page by hand

To get started on the conversion, I made a copy of a simple .html page, inlined the base and mobile CSS rules, added amp to the <html> element, and loaded it in Chrome with #development=1 on the end of the URL and the JS console open to see what's broken.

Wow, a lot of stuff is broken!

AMP displays pretty decent error messages for validation errors, so I:

  • changed <img> tags to <amp-img> and added closing </amp-img> tags,
  • saw a ton of errors about the navbox and just commented out its HTML and CSS for now,
  • added minimum-scale=1 to the <meta name="viewport"> tag (it already had initial-scale=1),
  • dropped style="background-position-x: ..." attributes from some elements,
  • dropped style="width: ..." attributes from image containers,
  • removed a DOM-loaded <script> tag that wired up toggle-navbox event listeners,
  • moved the logo to the top of the page instead of above the navbox,
  • noticed that AMP overrides <body>'s margin, setting it to 0, and instead set a margin on the top-level container element,
  • and made a bunch of other miscellaneous CSS additions.

JavaScript console showing AMP validator

It validates!

Screenshots of AMP page with hidden and shown sidebar
An AMP page with the sidebar hidden and shown

The next step was adding an <amp-sidebar> to bring back the navigation links. This was surprisingly straightforward: I added a <script> before the main v0.js AMP <script> to pull in the sidebar JavaScript, moved the navbox <div> within a new pair of <amp-sidebar> opening and closing tags, and then added an <amp-img> (temporarily using my old expand/collapse triangle icon as a placeholder) with an on attribute telling it to open the navbox when clicked:

<amp-sidebar id="sidebar" layout="nodisplay" side="right">
  <div id="navbox">
    ...
  </div>
</amp-sidebar>

<div id="top-container">
 ...
  <amp-img id="menu-button" src="collapse.png" alt="menu"
           width="19" height="17" tabindex="0" role="button"
           on="tap:sidebar.open"></amp-img>
</div>

Updating the workflow

I felt like I had a pretty good grasp of the sort of changes that I'd need to make to get most of the pages validating now, so I turned toward adding this stuff to template_lib.rb. Some of it seemed desirable for non-AMP pages as well, like adding minimum-scale=1, so I did that first. For the rest, like using <amp-img> rather than <img>, I knew I'd need to make template_lib.rb produce different output depending on whether it was generating an AMP page or a non-AMP one. An environment variable seemed like the easiest way to pass data into ERB, so I updated the build script to run ERB a second time for each .rhtml template with AMP=1 prepended to the command line, like:

AMP=1 erb -r ./template_lib.rb my_page.rhtml >my_page.amp.html
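
For illustration, here's a rough sketch of how a build step like this might be written in Ruby itself. The article doesn't show the real build script, so the helper names (stale?, build_jobs, build) and the decision to treat template_lib.rb as an input when checking freshness are assumptions made for this example; only the erb command lines and the .html / .amp.html naming come from the text.

```ruby
# Hypothetical two-pass build step: one erb run for the regular page and
# one with AMP=1 for the AMP page.

# True if the output file is missing or older than any of its inputs.
def stale?(output, inputs)
  return true unless File.exist?(output)
  out_mtime = File.mtime(output)
  inputs.any? { |input| File.mtime(input) > out_mtime }
end

# Describes both erb runs for a template: extra environment variables,
# the argv to execute, and the file that stdout should be redirected to.
def build_jobs(template, lib: './template_lib.rb')
  base = template.sub(/\.rhtml\z/, '')
  argv = ['erb', '-r', lib, template]
  [
    { env: {},               argv: argv, output: "#{base}.html" },
    { env: { 'AMP' => '1' }, argv: argv, output: "#{base}.amp.html" },
  ]
end

# Runs only the jobs whose output is out of date.
def build(template, lib: './template_lib.rb')
  build_jobs(template, lib: lib).each do |job|
    next unless stale?(job[:output], [template, lib])
    # Kernel#system accepts an environment hash and an out: redirection.
    system(job[:env], *job[:argv], out: job[:output]) or
      raise "erb failed for #{job[:output]}"
  end
end
```

Calling build('my_page.rhtml') would then regenerate my_page.html and my_page.amp.html only when they're older than the template or the library.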

In template_lib.rb, I set a global variable at the top of the file using:

$amp = ENV['AMP'] == '1'

and then later did stuff like:

def add_image(...)
  img_tag = $amp ? 'amp-img' : 'img'
  img_end = $amp ? '</amp-img>' : ''
  ...
  "<#{img_tag} ...>#{img_end}"
end

I spent roughly forever trying to figure out why this wasn't working when I first used $amp = ENV['AMP'] == 1. Then I remembered that Ruby, unlike JavaScript, doesn't coerce strings into integers or vice versa for comparisons. I think I might've even followed this up with still more confusion by using $amp = ENV['AMP'] and running AMP= erb ... for the non-AMP pages, forgetting that empty strings, the string "0", and even the integer 0 are all truthy in Ruby. Sigh.
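
Both gotchas are easy to reproduce in isolation. The snippet below is just a demonstration; truthy? and amp_enabled? are hypothetical helper names that don't appear in the original script.

```ruby
# Ruby never coerces between String and Integer for ==, so this is false.
string_equals_int = ('1' == 1)

# Only nil and false are falsy in Ruby; '', '0', and 0 all count as true.
def truthy?(value)
  value ? true : false
end

# The working check from above: AMP mode is on only for the exact string "1".
# Accepts ENV or any hash-like object.
def amp_enabled?(env)
  env['AMP'] == '1'
end

puts string_equals_int                 # false
puts truthy?('')                       # true
puts truthy?('0')                      # true
puts truthy?(0)                        # true
puts truthy?(nil)                      # false
puts amp_enabled?({ 'AMP' => '1' })    # true
puts amp_enabled?({ 'AMP' => '' })     # false
puts amp_enabled?({})                  # false
```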

Once I'd fixed that, I integrated the rest of my changes into template_lib.rb. This included adding AMP's boilerplate CSS rules, along with creating a new file containing my own AMP-specific CSS rules (inlined into my big <style> tag by template_lib.rb) and updating many of the script's functions to check $amp and produce appropriate output. There was also a bit of trickiness around using the correct order for the elements in my <head> tag; AMP has strict rules around this.

Notable tricky bits

One possibly-interesting change I had to make involved places where template_lib.rb was setting random background-position-x inline styles on some elements to vary their appearance. AMP doesn't allow style attributes on elements, so I had to find some other way of doing this. With the help of StackOverflow, I realized that I could approximate the current look (with periodic repetition) by using the :nth-of-type pseudo-class to vary the position used on successive elements:

.contentbox:nth-of-type(8n) .boxtop {
  background-position-x: 0;
}
.contentbox:nth-of-type(8n+1) .boxtop {
  background-position-x: 1024px;
}
...
.contentbox:nth-of-type(8n+6) .boxtop {
  background-position-x: -768px;
}
.contentbox:nth-of-type(8n+7) .boxtop {
  background-position-x: 256px;
}
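
Since template_lib.rb is already generating the page, rules like these could plausibly be emitted from a list of offsets rather than typed out by hand. This is only a sketch: nth_of_type_rules is a made-up name, and the four-entry offset list in the usage line is illustrative and is not the full set of eight values used on the site.

```ruby
# Builds one :nth-of-type rule per offset so that successive .contentbox
# elements cycle through the given background positions (in pixels).
def nth_of_type_rules(offsets_px)
  period = offsets_px.size
  rules = offsets_px.each_with_index.map do |offset, index|
    # The first rule uses "Nn"; the rest use "Nn+1", "Nn+2", and so on.
    position = index.zero? ? "#{period}n" : "#{period}n+#{index}"
    value = offset.zero? ? '0' : "#{offset}px"
    ".contentbox:nth-of-type(#{position}) .boxtop {\n" \
      "  background-position-x: #{value};\n" \
      "}"
  end
  rules.join("\n")
end

# Illustrative offsets only; the real list has eight entries.
puts nth_of_type_rules([0, 1024, -768, 256])
```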

Some unexpected work came from AMP's requirement that AMP pages link to their content's canonical (usually non-AMP) URLs using a <link rel="canonical"> tag, while the canonical pages link to the AMP URLs with <link rel="amphtml"> — this gives search crawlers a way to discover AMP pages. I was using relative URLs throughout my site, and template_lib.rb didn't have any idea of what an absolute, canonical URL should even look like. I had to add some more logic to the script to make it able to generate links in both directions. I should mention here that there's an AMP URL API that you can use to request both cached- and non-cached AMP copies of a given canonical URL, but it seems like it's targeted at sites and apps like Pinterest or Twitter that might want to rewrite tons of external links to point at faster-loading AMP versions instead.
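
The two-way linking boils down to something like the following sketch. SITE_ROOT, amp_path, and discovery_link are hypothetical names, and the code assumes that every page follows the foo.html / foo.amp.html naming used by the build step; the real logic in template_lib.rb isn't shown in this article.

```ruby
SITE_ROOT = 'https://www.erat.org'

# Maps a site-relative page path like "/index.html" to its AMP counterpart.
def amp_path(path)
  path.sub(/\.html\z/, '.amp.html')
end

# Returns the <link> tag for a page's <head>: AMP pages point at the
# canonical page, and canonical pages point at the AMP version.
def discovery_link(path, amp:)
  if amp
    %(<link rel="canonical" href="#{SITE_ROOT}#{path}">)
  else
    %(<link rel="amphtml" href="#{SITE_ROOT}#{amp_path(path)}">)
  end
end

puts discovery_link('/index.html', amp: true)
puts discovery_link('/index.html', amp: false)
```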

I'm able to get an idea of how often my regular pages are being loaded by looking at my web server's logs, but AMP pages might end up getting served directly from Google via the AMP Cache. I decided to add Google Analytics tracking back to the AMP pages so I can at least see if anyone's loading them. Luckily, this was basically no work at all: You can add <amp-analytics> elements that send pings in response to various events (in my case, page load). For Google Analytics, I just copy-and-pasted the "page tracking" example from here and inserted my Analytics-supplied property ID.

So at this point, there are three ways to load my pages. Taking the main index.html page as an example, you can use:

  • https://www.erat.org/index.html, the canonical non-AMP version served by my webhost
  • https://www.erat.org/index.amp.html, the AMP version served by my webhost
  • https://cdn.ampproject.org/c/s/www.erat.org/index.amp.html, the AMP version served by Google from the AMP Cache
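
The cache URL in the last bullet appears to be derived mechanically from the AMP page's own URL. Here's a small sketch of that mapping; amp_cache_url is a made-up helper, the /c/s/ prefix for HTTPS pages matches the example above, and the plain /c/ prefix for HTTP pages is my understanding of the scheme rather than something covered in this article. Query strings are ignored.

```ruby
require 'uri'

# Builds the Google AMP Cache URL for an AMP page served from its origin.
def amp_cache_url(amp_url)
  uri = URI.parse(amp_url)
  # "c" means content; the extra "s" marks an origin served over HTTPS.
  prefix = uri.scheme == 'https' ? 'c/s' : 'c'
  "https://cdn.ampproject.org/#{prefix}/#{uri.host}#{uri.path}"
end

puts amp_cache_url('https://www.erat.org/index.amp.html')
# https://cdn.ampproject.org/c/s/www.erat.org/index.amp.html
```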

The CDN version ought to load quickly, but that URL is really ugly. There's also a big question that you should ask whenever using someone else's version of a URL instead of your own canonical URL: how long will this keep working? I know that www.erat.org URLs will work as long as I keep paying the bills, but Google has... uh... a not-so-stellar track record around supporting products that don't make money. Luckily, someone asked about this on Stack Overflow. The circa-July-2016 answer there was, "We recommend that people link to the canonicals[,] not to the Google AMP Cache versions of their pages." In my case, I think that "canonicals" refers to my non-AMP canonical URLs — I don't want to link directly to my AMP pages since they won't look right for desktop users (although this might change if or when I update the AMP pages to look reasonable on larger devices). There's been more discussion since then on the linked Twitter conversation, and there are now AMP Cache Guidelines that strongly suggest that it should be safe to link to CDN URLs:

4. [An AMP cache] pledges to maintain URL space forever (even beyond the lifetime of the cache itself):
i. This can be achieved by donating the URL space to a trustworthy third party entity such as archive.org.
ii. This means that, should a cache decide to no longer operate, URLs should redirect to the origin URL or be served by another cache.

Screenshot showing missing image
What's that blank space doing there?

But there's something else that makes me reluctant to depend on the Google AMP Cache: I don't have any recourse when things break. If the cache goes down entirely, or starts serving a stale or incorrect version of my page, or decides that it no longer validates, or whatever, there's nothing I can do about it. I actually ran into this immediately after uploading my AMP pages and loading them via the cdn.ampproject.org URLs: the orange image across the top of my content boxes was missing. From the dev console, I could see that the AMP Cache was returning a 404 error instead of the image. This was especially weird since other images served by the cache (and even the similar blue image across the top of the navbox!) were working. I submitted a bug report, and it turned out to be a problem with the way that the cache tries to resize extremely-wide-and-short images (the orange image was 4096x2). But it stayed broken for weeks, which wasn't confidence-inspiring.

Screenshot showing solid color in place of missing image
Slightly better...

In the meantime, I set the image's background-color style (which I should've done in the first place) so visitors at least see a solid orange color when loading pages from the cache. Then I gave up and switched to a differently-sized image that didn't trigger the bug.

I eventually decided to stick to the following practices:

  • When I link to one of my pages from somewhere else, I should use the non-AMP version hosted at my domain, e.g. https://www.erat.org/foo.html.
  • My non-AMP pages can continue to link to other pages and images, JS files, etc. from my domain using relative URLs like /bar.html or /resources/some_image.png.
  • My AMP pages should use relative URLs in <amp-img> elements (since I want the AMP Cache to be able to rewrite those to be served from the cache), but they should use absolute URLs like https://www.erat.org/bar.amp.html (note that I'm linking to the AMP versions of other pages) and https://www.erat.org/files/something.tar.gz when linking to other pages or downloadable files within the domain. The AMP Cache seems to try to rewrite relative URLs intelligently when it's serving pages, but I figure it's safer to do this myself instead of relying on whatever heuristics it's using.

To make these rules easier to follow, I added a helper function to template_lib.rb that rewrites a given relative URL appropriately depending on whether it's generating an AMP or non-AMP page and then updated all of my .rhtml files to call it instead of including relative URLs directly.
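
A helper along those lines might look roughly like this. It's a sketch under assumptions: rewrite_url, the kind: argument, and the SITE constant are invented for this example, and it doesn't account for pages that have no AMP version.

```ruby
SITE = 'https://www.erat.org'

# Rewrites a site-relative URL according to the rules above.
#   kind: :page  - link to another page on the site
#   kind: :file  - link to a downloadable file
#   kind: :image - image source used in <img> or <amp-img>
def rewrite_url(path, amp:, kind: :page)
  return path unless amp          # non-AMP pages keep relative URLs
  return path if kind == :image   # let the AMP Cache rewrite image URLs
  # Point page links at the AMP version unless they already do.
  path = path.sub(/(?<!\.amp)\.html\z/, '.amp.html') if kind == :page
  "#{SITE}#{path}"
end

puts rewrite_url('/bar.html', amp: false)
puts rewrite_url('/bar.html', amp: true)
puts rewrite_url('/files/something.tar.gz', amp: true, kind: :file)
puts rewrite_url('/resources/some_image.png', amp: true, kind: :image)
```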

There's actually one other option here: I could configure my web server to detect mobile user agents and redirect them to the AMP version of the page automatically. This can result in pretty frustrating moments for users, though: how many times have you ended up seeing the mobile version of a page on a desktop machine after following a shared link, or struggled to convince a site to serve the full desktop version of a page to your phone instead of some useless scaled-down mobile version? Additionally, a redirect forces an extra network round-trip before the client can render anything (although HTTP/2 server push may change this). Google has a page describing some best practices for serving separate mobile content, but the <link rel="amphtml"> tag already makes it likely that Google will direct mobile users to the AMP versions of my pages, so I decided not to try to do anything tricky.

Validation

My build script already validates each page that it generates by uploading it to https://validator.w3.org/ with a command like:

curl --silent --show-error \
  -F 'action=check' \
  -F 'uploaded_file=@[local file]' \
  http://validator.w3.org/check

and looking for the text "This document was successfully checked" in the output. That seems to work fine for HTML5 documents, but it'd be nice to do something more to check that the AMP pages are abiding by all the additional rules that are present there — this is even more important since the AMP Cache will only cache valid documents. There's an online AMP validator, but it seems oddly geared toward validate-as-you-type use, and it wasn't clear to me whether there's any way to upload a local file to it. The only way I could find to validate local files was by using the amphtml-validator Node.js package as described in the "Command Line Tool" section of the Validate AMP Pages doc. I wouldn't call it blazingly fast, but it seems to get the job done. Note that the AMP validator currently reports failure when passed AMP Cache URLs.
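
The W3C half of that check can be wrapped in a few lines of Ruby. This is a sketch rather than the actual build script: the function names are invented, the endpoint and success text are simply the ones quoted above, and nothing here covers the AMP validator.

```ruby
require 'open3'

W3C_SUCCESS_MARKER = 'This document was successfully checked'

# Builds the curl argv for uploading a local file to the validator.
# curl's -F option needs a leading "@" to send the file's contents
# instead of the literal path string.
def w3c_curl_command(path)
  ['curl', '--silent', '--show-error',
   '-F', 'action=check',
   '-F', "uploaded_file=@#{path}",
   'http://validator.w3.org/check']
end

# True if the validator's response contains the success text.
def w3c_success?(output)
  output.include?(W3C_SUCCESS_MARKER)
end

# Uploads the file and reports whether it validated (requires network).
def w3c_valid?(path)
  output, status = Open3.capture2(*w3c_curl_command(path))
  status.success? && w3c_success?(output)
end
```

Keeping the string check separate from the network call makes it possible to test the parsing against a saved response.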

Iframes!

I had AMP versions of almost all of my site's pages now, but there were still those two holdouts: the glucometer page with the d3.js <svg> graphs and the pullup-bars page with the embedded map. For the glucometer page, it was pretty clear there was no way I was going to get the graphs working within the AMP page — d3.js is a JavaScript library, after all. I could've just screenshotted the graphs and used static images in the AMP page, but after I saw that AMP supports iframes via the <amp-iframe> element, I decided to try to move the graph-rendering code into a new non-AMP HTML file and embed it into the AMP page.

After searching for terms like "responsive svg", I found some pages like this one that were super-helpful for setting the right styles on my <svg> elements to make them scale to their containers' widths. I eventually ended up with this in the AMP page:

iframe.graph {
  background-color: transparent;
  border: solid 1px #ddd;
  padding: 0;
  overflow: hidden;
}

<amp-iframe class="graph" width="300" height="200"
    layout="responsive" frameborder="0" sandbox="allow-scripts"
    src="https://www.erat.org/iframes/glucometer_graph.html?w=300&h=200&g=historical">
</amp-iframe>

I put an absolute URL in the src attribute since the framed page isn't AMP-compliant and won't get cached by the AMP Cache. I strongly suspect that the cache is already built to handle this case by rewriting links to non-cached framed pages, but I decided to be explicit instead of counting on that. It does make development a bit trickier, though, as local copies of the main page still embed the online iframe page.

glucometer_graph.html contains some JavaScript that extracts the w, h, and g query parameters from window.location and renders the requested graph at the requested size. (I believe the size is needed to determine the graph's aspect ratio, which should remain fixed even when the graph is scaled.) This page uses the following CSS to make its SVG element take up all the space allotted to it:

body {
  margin: 0;
  overflow: hidden;
}

svg.graph {
  width: 100%;
  height: 100%;
  display: inline-block;
  position: absolute;
  background-color: white;
}

Some additional attributes are set on the <svg> element when d3.js creates it as the page first loads (width and height here are the dimensions that are used for the graph on the non-AMP page and that define the bounds used when placing elements in it):

var svg = d3.select(selector)
    .append("svg:svg")
    .data(...)
    .attr("preserveAspectRatio", "xMinYMin meet")
    .attr("viewBox", "0 0 " + width + " " + height)
    .attr("class", "graph");

This seems to mostly work. The iframe maintains its aspect ratio while expanding to fill the page width, and the SVG element within it scales the position and dimensions of all of its elements accordingly. AMP is happy since the loading and execution of this slow, bulky JavaScript doesn't block the main page; in fact, I even get an animated spinner in each iframe if I scroll down to it before it's ready.

I'm punting on making an AMP version of the other page with the embedded map, because I'm not quite sure how or even if I'll be able to make the map interact with the rest of the page. It's likely that I'll need to pare down the functionality of the page in its AMP version.

Structured data

"Add required structured data to your AMP pages" email

At this point, I felt like I was pretty much done with the converting-to-AMP effort, but I got email from Google (by virtue of having registered my site in the Search Console) complaining that my site was missing Schema.org structured data to make it easier to index. There's a strong suggestion at the bottom of that AMP basic markup page to include this, too: it's apparently a requirement to be considered for inclusion in the Google Search news carousel. The chance of that happening for any of my pages seems like it's basically zero, but I figured I'd go ahead and add structured data in the JSON-LD format anyway.

The AMP Project provides a listing of metadata examples, including this one that's pretty much exactly what I wanted. I didn't see this at first, so I instead spent too long reading about JSON-LD and the fields that Google wants and exploring the bizarre taxonomical urges that drive the Schema.org people to try to classify everything in the world. All that I actually needed to do was to add some code that includes a JSON blob like this in each page's <head>:

<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "Article",
  "mainEntityOfPage": "https://www.erat.org/sleep.html",
  "headline": "How to Get More Sleep",
  "description": "Tips for sleeping longer",
  "datePublished": "2011-07-06",
  "dateModified": "2011-07-17",
  "author": {
    "@type": "Person",
    "name": "Daniel Erat",
    "email": "dan-website@erat.org"
  },
  "publisher": {
    "@type": "Organization",
    "name": "erat.org",
    "url": "https://www.erat.org/",
    "logo": {
      "@type": "ImageObject",
      "url": "https://www.erat.org/resources/erat-125-20160914.png",
      "width": 125,
      "height": 45
    }
  },
  "image": {
    "@type": "ImageObject",
    "url": "https://www.erat.org/files/sleep/shades.jpg",
    "width": 1024,
    "height": 768
  }
}
</script>

I hadn't kept track of when I originally wrote some of these pages, so I had to do a lot of git spelunking and just guess in some cases. Google asks for images that are at least 696 pixels wide, and I don't have any that large for most of the pages, so I just left out the image section for those. Google's Structured Data Testing Tool was helpful for checking that I didn't mess anything up.
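
Generating the blob from Ruby is straightforward with the standard json library. The sketch below is not the code from template_lib.rb: article_json_ld and its keyword arguments are invented for illustration, and the author and publisher sections from the example above are left out for brevity. It omits the image section when no image is supplied, as described above.

```ruby
require 'json'

# Returns a <script> element containing Article structured data.
# image, if given, is a hash with :url, :width, and :height keys.
def article_json_ld(url:, headline:, description:, published:, modified:, image: nil)
  data = {
    '@context' => 'http://schema.org',
    '@type' => 'Article',
    'mainEntityOfPage' => url,
    'headline' => headline,
    'description' => description,
    'datePublished' => published,
    'dateModified' => modified,
  }
  if image
    data['image'] = {
      '@type' => 'ImageObject',
      'url' => image[:url],
      'width' => image[:width],
      'height' => image[:height],
    }
  end
  # Escape "</" so that text such as "</script>" inside a value can't end
  # the element early; "\/" is a valid JSON escape for "/".
  json = JSON.pretty_generate(data).gsub('</') { '<\/' }
  %(<script type="application/ld+json">\n#{json}\n</script>)
end

puts article_json_ld(
  url: 'https://www.erat.org/sleep.html',
  headline: 'How to Get More Sleep',
  description: 'Tips for sleeping longer',
  published: '2011-07-06',
  modified: '2011-07-17'
)
```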

Note that adding lots of data above the <link rel="amphtml"> tags on your canonical pages can apparently result in your AMP pages not being found by Google, so don't do that.

Wrapping up

So, that mostly wrapped up the effort. I'm still a bit ambivalent about some aspects of AMP and hesitant to go AMP-only. There are users with older browsers that don't support recent standards like web components, and while I'd expect much of the content on my pages to still render for them, tags like <amp-sidebar> and <amp-img> may just get silently dropped. AMP allows <noscript> tags in some places to handle the no-JavaScript scenario, but I'm concerned about the middle ground of browsers that support JavaScript, but poorly.

I'm also ceding some control of my site, both by depending on AMP JavaScript files that could change at any time (I've noticed a few issues around sidebar behavior; one was fixed quickly and the other is still being discussed) and by letting my site be mirrored by the AMP Cache (remember the earlier image-serving bug?). For the former issue, if the canonical AMP JS files break horribly, I guess I can always switch to using my own locally-hosted copies from an older version. There's a good chance I wouldn't even notice if my site breaks due to an upstream change, but one could also make the argument that browser vendors are always pushing new versions of their code that could break my site as well.

89/100 score for AMP page on Google PageSpeed Insights
Now 11 slower than before!

Amusingly (I guess... ?), the 100/100 mobile speed score I'd managed to get from Google's PageSpeed Insights tool actually fell to 89/100 in the AMP versions of my pages. This seems to be due to the pages not being renderable until the AMP JavaScript is loaded (which is kind of the whole point of AMP) and the AMP CDN serving the scripts with a 50-minute cache timeout, shorter than the one-week-to-one-year lifetime that PageSpeed Insights recommends for static-ish content. I think the idea is that AMP will be (or maybe already is) ubiquitous enough that browsers will typically already have AMP's JavaScript files cached when loading a page, obviating the need for blocking on additional network requests. I suppose it's a good thing that PageSpeed Insights doesn't carve out exceptions for AMP, although it feels weird that my score dropped when adopting a standard that's trying to adhere to best practices for creating mobile-friendly sites.

This touches on another criticism of AMP that I've seen: it's entirely possible to get all of the advantages of AMP (with the possible exception of preloading), or to potentially achieve even better performance, without actually using AMP. As an extreme example of this, consider a simple HTML page with no JavaScript and no external resources. It'd be unfortunate if search engines started favoring AMP pages over just-as-fast non-AMP pages. So far, this hasn't happened.

The AMP tech lead's stance on the mailing list seems reasonable to me: "Its true that super trivial pages are faster than trivial AMP pages, but we find that most real world pages do not fall into the former bucket. AMP goes beyond raw performance. It e.g. supports a prerender mode that gives both privacy guarantees and saves CPU and bandwidth. Super fast non-AMP pages do not support such a mode (at least cannot be provably known to support it)." You can find more discussion about the politics of AMP scattered across the web, of course.

Looking at this as a user, AMP seems great. I preferentially click links with the AMP logo next to them due to how quickly they load and how much less obtrusive the advertising on them is. For example, compare the non-AMP version of this Los Angeles Times article against the AMP version (note that you'll need to follow the AMP link on a phone or with a mobile user-agent lest you get redirected to the non-AMP page). The non-AMP version takes a long time to load, and shifts around, and displays an interstitial ad covering the whole page, and then includes an auto-playing video that follows me when I try to scroll past it! In contrast, the AMP page loads quickly and just includes a few tasteful ads.

And even as a developer, AMP seems like a good thing to me. Just doing some simple benchmarking of my graphics page from my home network, I see similar times to load the main HTML file and do the first paint (likely because I'd already optimized the page). The AMP version sometimes takes a bit longer to completely finish loading the page; I suspect that this is due to below-the-fold or background scripts that don't impact the user experience. The story changes when I load the AMP Cache's copy. Now, the HTML file frequently loads in 100 milliseconds, the first paint happens before the 200-millisecond mark, and the whole thing is often done in less than half a second. I'm essentially getting a Google-run CDN for free.

Page        Got HTML (ms)   First paint (ms)   Loaded (ms)
non-AMP     260             320                650
AMP         260             320                800
AMP Cache   100             170                400

After I did all of this, I came across a page describing someone else's experiences generating an AMP version of their website. They don't seem too different from my own.

To summarize, I like AMP and think you should use it! If you give it a try, here are some useful resources:

Oh, and you can view the AMP version of this page. (It'll look funny if you aren't on mobile.)

Addendum: HERE COMES A NEW IFRAME (2016-10-27)

I was feeling ambitious and/or cocky, so I decided to try to create an AMP version of that one last page with the embedded map. Here's a high-level overview of its original state:

  • There's a <script> element that loads the Maps JavaScript API.
  • There's a chunk of my own JavaScript that uses the API to embed a map in the page and place a bunch of markers on it.
  • Below the map, there are blurbs describing the various featured locations.
  • When a map marker is clicked, an info bubble is displayed containing a #some-blurb-id link that jumps down to the corresponding blurb.
  • Each blurb has a link of its own that calls a JavaScript function that scrolls the page back up to the embedded map and selects the corresponding marker.

There's no place for custom JavaScript in AMP, so it seemed obvious that I'd need to use the same iframe-based approach that I used for the glucometer page. It was straightforward to move all of the page's JavaScript into a new page and embed it using an <amp-iframe> element (which maybe loads a tiny bit slower than embedding the map directly into the main page). The tricky part was going to be preserving (or at least approximating) the existing clicking-on-map-marker-scrolls-main-page, clicking-on-link-in-main-page-selects-map-marker behavior.

First, I decided to try to get this working again on the non-AMP version of the page. Per the same-origin policy, the framed page and the main page are able to access each other as long as they're served from the same place. The framed page can reach out using window.parent or window.top, so I made the map info bubble set window.top.location.hash to scroll the main page to a given blurb. Going in the other direction, I decided that it'd be cleanest if the main page uses postMessage to send objects to the framed page — this works even across origins. I ended up with the following in the main page:

// Wire up links to post messages to the iframe to activate markers.
document.addEventListener('DOMContentLoaded', function() {
  var iframe = document.getElementById('map');
  var anchors = document.getElementsByClassName('map-link');
  for (var i = 0; i < anchors.length; i++) {
    var a = anchors[i];
    var id = a.parentElement.parentElement.id;
    var f = iframe.contentWindow.postMessage.bind(
        iframe.contentWindow, {id: id}, '*', []);
    a.addEventListener('click', f, false);
  }
}, false);

The framed page then listens for messages that get posted to it:

window.addEventListener('message', function(e) {
  selectPoint(e.data.id, true);
}, false);
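Because the main page posts with a target origin of '*' and postMessage works across origins, the framed page may want to check where a message came from before acting on it. Here's a minimal sketch of that idea. The makeMessageHandler helper and the list of allowed origins are hypothetical and not part of the original page; selectPoint is the framed page's existing function.

```javascript
// Hypothetical helper (not in the original page): builds a 'message' listener
// that ignores messages from unexpected origins or without a string id.
function makeMessageHandler(allowedOrigins, selectPoint) {
  return function(e) {
    // Drop messages posted by pages that aren't on the allow-list.
    if (allowedOrigins.indexOf(e.origin) === -1) return;
    // Drop messages that don't look like {id: 'some-blurb-id'}.
    if (!e.data || typeof e.data.id !== 'string') return;
    selectPoint(e.data.id, true);
  };
}

// Possible wiring in the framed page (the origin list is an assumption):
// window.addEventListener('message',
//     makeMessageHandler(['https://www.erat.org'], selectPoint), false);
```

A local development server would have a different origin, so its address would need to be added to the list while testing.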

This worked, so it was time to look at the AMP version. The JavaScript in the main page had to go when $amp is set; I instead made the blurbs link to #map to just scroll the map into view without selecting any markers. On the iframe side, I initially tried to continue updating window.top.location.hash, or even window.top.location.href after reading the old href and appending the fragment. AMP requires iframes to be sandboxed, so to convince the browser to permit this, I needed to set allow-top-navigation, along with allow-same-origin... which AMP explicitly forbids when AMP pages and framed content are served from the same origin. The reason for this makes sense: an AMP page might get served from a cache, in which case it'll no longer be coming from the same origin as the framed page — it seems reasonable to block this upfront so developers don't get confused later when their cached pages don't work.

I discovered that I was still able to assign to window.top.location.href from the framed page without allow-same-origin, just not read from it. Assigning to location.hash seems like it's treated the same as reading from location.href (maybe that's how it's implemented?), so I wasn't able to do that. And to make the iframe construct a main page URL containing a fragment, I realized I'd somehow need to pass it the main page URL — otherwise, it wouldn't know if it should navigate to https://www.erat.org/pullups.html#foo or https://www.erat.org/pullups.amp.html#foo. I tried passing the outer URL to the framed page via a query string:

<amp-iframe id="map" width="640" height="480"
    layout="responsive" frameborder="0" 
    sandbox="allow-scripts allow-top-navigation"
    src="https://www.erat.org/iframes/pullups_map.html?https://www.erat.org/pullups.amp.html">
</amp-iframe>

In the framed page, I grabbed this URL using:

var pageUrl = window.location.search.substring(1);

and then later scrolled the main page with:

window.top.location = pageUrl + '#' + id;

This mostly works, but when the page is served by the CDN (or by a local server I'm running for development), this code navigates back to the copy of the page on my webhost. The outer page can't pass its actual location from window.location because doing so would require JavaScript.

This StackOverflow answer, suggesting using document.referrer, offered a glimmer of hope. I was initially concerned that the referrer wouldn't be passed to the framed page, but the following returns the outer URL (minus any fragment) regardless of whether it's on my webhost, the CDN, or a local server:

var pageUrl = document.referrer.split('#', 1)[0];
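Putting the referrer and the fragment together might look something like the sketch below. The blurbUrl helper is hypothetical. It also bails out when the referrer is empty (which can happen under a restrictive referrer policy), since a bare '#id' would then be resolved against the wrong page.

```javascript
// Hypothetical helper: strips any fragment from the referrer and appends the
// blurb's id as the new fragment. Returns null if there's no referrer to use.
function blurbUrl(referrer, id) {
  var pageUrl = referrer.split('#', 1)[0];
  if (!pageUrl) return null;
  return pageUrl + '#' + id;
}

// Possible use in the framed page's info-bubble click handler:
// var url = blurbUrl(document.referrer, id);
// if (url) window.top.location = url;
```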

So this brings the AMP page almost to parity with the non-AMP version: clicking links in blurbs jumps up to the map without selecting markers, but the more-important links in the map scroll down to the corresponding blurbs as before.

When I tried to load the AMP page on a laptop instead of a phone, I ran into another restriction:

amp-iframe may not appear close to the top of the document (except for iframes that use placeholder as described below). They must be either 600px away from the top or not within the first 75% of the viewport when scrolled to the top – whichever is smaller.
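As I read that rule, the minimum distance from the top works out to the smaller of 600px and 75% of the viewport height. A tiny sketch of the arithmetic, using a hypothetical minIframeTop helper:

```javascript
// Hypothetical helper: the minimum distance (in CSS pixels) that an
// <amp-iframe> without a placeholder must be from the top of the document,
// based on my reading of the quoted rule.
function minIframeTop(viewportHeight) {
  return Math.min(600, 0.75 * viewportHeight);
}
```

So with a 640px-tall viewport the iframe would need to start at least 480px down, while with a 900px-tall viewport the 600px limit applies instead.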

Well, that seems straightforward enough. I loaded the page on a laptop with a high-DPI display and took a screenshot of the map. Then I added an <amp-img> element with a placeholder attribute between the <amp-iframe> tags:

<amp-img layout="fill" placeholder
    src="files/pullups/map_placeholder-1448.png"
    width="640" height="480"
    alt="[map placeholder]"></amp-img>

I decided it'd also be nice to show a placeholder on the non-AMP page, so I first styled the <iframe> in the outer page to display the image before its contents show up:

.mapbox iframe {
  ...
  background-image: url('/files/pullups/map_placeholder-1448.png');
  background-size: 100% 100%;
}

Within the framed page, the map itself can take a while to load, so I also styled the <body> to display the placeholder image:

body {
  margin: 0;
  overflow: hidden;
  background-image: url('/files/pullups/map_placeholder-1448.png');
  background-size: 100% 100%;
}

The map object apparently fires an event once it has fully loaded and become idle, so I made it initially hidden:

#map-div {
  width: 100%;
  height: 100%;
  display: inline-block;
  visibility: hidden;
  position: absolute;
}

#map-div.loaded {
  visibility: visible;
}

Here's the JavaScript that makes the map visible once it's ready:

// Only show the map once it's fully loaded.
google.maps.event.addListenerOnce(map, 'idle', function() {
  mapDiv.className = 'loaded';
});

This works pretty well! The real map's scale can be different from the placeholder's depending on the browser's display size, and there's a chance that users might try to interact with the placeholder and get confused, but it's still less obtrusive than seeing a gray box when the page is first loaded.

The AMP version of the pullup-bars page is here.

Addendum: Google not indexing AMP pages (2016-10-29)

I noticed that Google stopped linking to the AMP versions of my pages when I searched using the AMP search preview from a phone (or when I did a regular mobile Google search — AMP pages were added to the main mobile search index in September 2016, so the demo site is probably no longer necessary). This was weird, because it was definitely displaying them with the AMP logo adjacent before. My AMP pages were still validating and being returned by the AMP Cache, and the Search Console reported that they were indexed and even receiving traffic, so I was at a loss for why they weren't being returned as results.

I found this then-unanswered StackOverflow question describing a similar problem, but it wasn't any help — Google was actually now returning the AMP version of the page mentioned in the question, and I didn't have enough reputation points on the site to be able to leave a comment asking the original poster if they'd changed anything to fix the problem.

I tried asking Google if it knew about the AMP versions of my pages using the ampUrls.batchGet API as described here. It didn't:

% curl -i -s -k -X POST \
  -H "Content-Type: application/json" \
  -H "X-Goog-Api-Key: [my key]" \
  -d "{urls: ['https://www.erat.org/pullups.html']}" \
  "https://acceleratedmobilepageurl.googleapis.com/v1/ampUrls:batchGet"
HTTP/1.1 200 OK
...

{
  "urlErrors": [
    {
      "errorCode": "NO_AMP_URL",
      "errorMessage": "No AMP URL for the request URL.",
      "originalUrl": "https://www.erat.org/pullups.html"
    }
  ]
}

When I searched for someone else's page, like https://ampbyexample.com, the API would return the AMP (in this case, same as the canonical) and CDN versions of the page:

{
  "ampUrls": [
    {
      "originalUrl": "https://ampbyexample.com",
      "ampUrl": "https://ampbyexample.com/",
      "cdnAmpUrl": "https://cdn.ampproject.org/c/s/ampbyexample.com/"
    }
  ]
}
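For checking more than a page or two, the request and responses above could be wrapped in a couple of small helpers. This is only a sketch: the endpoint, header, and body shape are taken from the curl command above, while the function names and the shape of the summary object are made up for illustration.

```javascript
// Builds the URL and options for an ampUrls:batchGet request.
// (Hypothetical helper; the values mirror the curl command above.)
function buildBatchGetRequest(apiKey, urls) {
  return {
    url: 'https://acceleratedmobilepageurl.googleapis.com/v1/ampUrls:batchGet',
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'X-Goog-Api-Key': apiKey
      },
      body: JSON.stringify({urls: urls})
    }
  };
}

// Maps each requested URL to its AMP and CDN URLs, or to the reported error
// code. (Hypothetical helper; the field names come from the responses above.)
function summarizeBatchGetResponse(response) {
  var result = {};
  (response.ampUrls || []).forEach(function(entry) {
    result[entry.originalUrl] = {
      ampUrl: entry.ampUrl,
      cdnAmpUrl: entry.cdnAmpUrl
    };
  });
  (response.urlErrors || []).forEach(function(err) {
    result[err.originalUrl] = {error: err.errorCode};
  });
  return result;
}

// Possible use, assuming an environment that provides fetch():
// var req = buildBatchGetRequest('[my key]', ['https://www.erat.org/pullups.html']);
// fetch(req.url, req.options)
//     .then(function(r) { return r.json(); })
//     .then(function(json) { console.log(summarizeBatchGetResponse(json)); });
```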

I compared my site's pages to other AMP pages being indexed by Google and quickly noticed one difference: in my non-AMP pages, the <link rel="amphtml" ...> element was appearing pretty far down the page, often more than 1300 bytes from the start of the file. (The huge JSON-LD structured data blob was mostly to blame for this.) Other pages put it near the top. I remembered that <meta charset> elements are supposed to appear within the first 1024 bytes of an HTML file, and it seemed totally plausible that Google could decide to also only scan the first kilobyte of a file looking for <link rel="amphtml">.

So, I updated my canonical pages so the <link rel="amphtml"> tags appeared as close to the top as possible, just after <meta charset>:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8"/>
  <link rel="amphtml" href="https://www.erat.org/pullups.amp.html">
  ...
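To keep an eye on this, something like the following rough check could report how far into a page the tag appears. The amphtmlLinkByteOffset helper is hypothetical, it uses a simple regular expression instead of a real HTML parser, and the 1024-byte cutoff is only my guess at Google's behavior, not a documented limit.

```javascript
// Hypothetical check: returns the UTF-8 byte offset of the first
// <link rel="amphtml"> tag in the given HTML, or -1 if there isn't one.
function amphtmlLinkByteOffset(html) {
  var match = /<link\b[^>]*\brel=["']?amphtml["']?[^>]*>/i.exec(html);
  if (!match) return -1;
  // Count bytes instead of characters, since non-ASCII text takes more than
  // one byte per character in UTF-8.
  return new TextEncoder().encode(html.slice(0, match.index)).length;
}

// Possible use while generating pages (the cutoff is an assumption):
// var offset = amphtmlLinkByteOffset(pageHtml);
// if (offset === -1 || offset >= 1024) console.warn('amphtml link at', offset);
```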

By the next morning, around eight hours later, the API was reporting that it knew about the AMP and CDN versions of my pages:

{
  "ampUrls": [
    {
      "originalUrl": "https://www.erat.org/pullups.html",
      "ampUrl": "https://www.erat.org/pullups.amp.html",
      "cdnAmpUrl": "https://cdn.ampproject.org/c/s/www.erat.org/pullups.amp.html"
    }
  ]
}

And just a few hours after that, my AMP pages were also getting returned for mobile searches, so I answered the StackOverflow question and filed an AMP bug requesting that this be clarified in the docs.

Addendum: More indexing problems (2016-11-19)

After a month or so, I was still excited enough about AMP that I decided to convert eatnum.com, another site that I made, to use it. I could probably write another long blog post about the process, but the upshot was that I ended up with dynamically-generated, canonical (i.e. no non-AMP versions) AMP pages at URLs like https://eatnum.com/19120.

More than a week after updating the site to use AMP, the Search Console was still reporting zero AMP pages, though. The site is basically just a bunch of machine-generated pages filled with numbers (*cough* although at least not covered with ads and Flash like other nutrition data websites), so I could accept Google choosing to take a while to recrawl it, but I noticed that the ampUrls.batchGet API also reported URL_IS_INVALID_AMP (Request URL is an invalid AMP URL.) for all of the AMP pages. This was weird, since both in-browser validation and the online validator said the pages were fine.

I eventually figured out that the problem was a hack that I'd added to make the non-interactive AMP versions of these pages link to the dynamic site: I'd placed <a> tags around an <input> element with hopes of replacing the static page with a dynamic version of the same view when the user clicked on the input field to search for something else. This worked as I'd intended in Chrome, but it's against the rules — the W3C's spec says that <a> can't contain interactive elements. After getting rid of the linked-<input> hack, the API suddenly became happy with the AMP pages.

If I'd been using an HTML5 validator, I would've caught this earlier, but the two HTML5 validators that I'm aware of (the W3C's and Validator.nu's) display so many errors related to AMP's custom elements and attributes that they're unusable for AMP pages unless I bolt on a bunch of filtering to ignore the expected problems. Without doing that, I'm not sure how to prevent this from happening again given how lenient the AMP validator seems to be, though.
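Short of filtering a real validator's output, one crude way to catch this particular mistake would be to scan generated pages for interactive elements nested inside links. The sketch below is regex-based, not a real HTML parser, and both the interactiveInsideAnchors name and its list of interactive elements are approximations for illustration. It will miss some cases and may flag a few legitimate ones (a hidden <input>, for example).

```javascript
// Crude, regex-based sketch (not a real HTML parser): returns the names of
// interactive elements that appear between an <a ...> tag and the next </a>.
function interactiveInsideAnchors(html) {
  var found = [];
  var anchorRe = /<a\b[^>]*>([\s\S]*?)<\/a\s*>/gi;
  // Approximate list of elements that count as interactive content.
  var interactiveRe =
      /<(a|button|details|embed|iframe|input|label|select|textarea)\b/gi;
  var anchor;
  while ((anchor = anchorRe.exec(html)) !== null) {
    var inner = anchor[1];
    var m;
    interactiveRe.lastIndex = 0;
    while ((m = interactiveRe.exec(inner)) !== null) {
      found.push(m[1].toLowerCase());
    }
  }
  return found;
}
```

Running this over a page that wraps a search field in a link, like the hack described above, would report 'input'.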

A bit later: While I haven't seen it mentioned in many places, there's also a Google-supplied AMP validator. I'm not sure whether it's stricter than the AMP project's validator, but I wouldn't be surprised if it includes some extra checks that can help when trying to figure out why Google isn't indexing the AMP version of a page. However, even with it reporting that my pages are "eligible for AMP search features in Google search results" and "[have] valid structured data", the Search Console itself still reports zero AMP pages for eatnum.com. So... *shrug*.