Open Data Must be Distributed via HTTPS

18 November 2015

HTTP is being replaced by HTTPS. The default has changed. The newly-released HTTP/2 standard is being implemented almost exclusively over HTTPS, which will leave HTTP as a legacy standard. Firefox and Chrome are both deprecating HTTP.

Government is following suit. With M-15-13, the White House has declared that “all browsing activity should be considered private and sensitive” (their emphasis), including APIs. Federal agencies have a deadline of December 31, 2016 to introduce HTTPS and eliminate HTTP.

Following the lead of the federal government, it’s time to make HTTPS the new standard for data published by state and local governments, too. This is for two main reasons:

  1. Using HTTPS protects the privacy of people accessing that data, e.g., a woman downloading a list of shelters for abused women in her area.
  2. HTTPS ensures that data cannot be altered in transit without detection. This is not a hypothetical concern: intermediaries such as ISPs and public WiFi hotspots sometimes modify data before delivering it to the client. With HTTPS, the client detects such tampering and drops the connection instead of accepting altered data.

Data integrity also matters to some folks in government. They worry that people will wind up with altered, inaccurate copies of data, and so they object to publishing data at all without a verification method. HTTPS addresses this by ensuring that the data people receive when they download it is the data that the publisher’s server sent.
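On the consumer side, the two protections described above come down to refusing plain HTTP and leaving certificate verification switched on. Here is a minimal sketch of that idea in Python, using only the standard library. The function names and the example URL are hypothetical, chosen for illustration, and the check covers only the URL you pass in: redirects are not re-checked in this sketch.

```python
import ssl
import urllib.request
from urllib.parse import urlsplit


def require_https(url: str) -> str:
    """Return the URL unchanged if its scheme is https; otherwise raise ValueError."""
    if urlsplit(url).scheme != "https":
        raise ValueError(f"refusing non-HTTPS URL: {url}")
    return url


def fetch_dataset(url: str, timeout: float = 30.0) -> bytes:
    """Download a dataset over HTTPS with certificate verification enabled.

    Assumption: only the initial URL is checked. A server-side redirect
    to another location is not re-validated by this sketch.
    """
    # The default context verifies the server certificate and hostname.
    context = ssl.create_default_context()
    checked = require_https(url)
    with urllib.request.urlopen(checked, timeout=timeout, context=context) as resp:
        return resp.read()
```

A call such as `fetch_dataset("https://data.example.gov/shelters.csv")` (a made-up address) would either return the bytes the server sent or raise an error if the certificate cannot be verified or the connection is tampered with.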

Vendors of data repository software, especially those who provide hosted solutions, can have a strong impact here. They’re in a position to add this feature for their clients at a trivial cost. Data outside of a repository, scattered around various websites, poses more of a challenge, but that is also a good incentive for governments to consolidate those scattered datasets in a central, HTTPS-enabled data repository. The inexorable move to HTTPS for all internet traffic means that all straggling data will eventually be protected by HTTPS anyway; there is no scenario in which it’s forever transmitted without encryption.
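For a government with datasets scattered across many sites, a reasonable first step is simply finding out which hosts still serve data over plain HTTP. The sketch below tallies non-HTTPS dataset URLs by host. The function name and the catalog entries are hypothetical, and it assumes you already have a list of dataset URLs from somewhere, such as a data inventory.

```python
from collections import Counter
from urllib.parse import urlsplit


def insecure_hosts(urls):
    """Count, per host, the dataset URLs that do not use the https scheme."""
    counts = Counter()
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme != "https":
            # hostname is None for malformed or relative URLs
            counts[parts.hostname or "(no host)"] += 1
    return counts


# Hypothetical catalog entries, for illustration only.
catalog = [
    "https://data.example.gov/budget.csv",
    "http://gis.example.gov/parcels.zip",
    "http://gis.example.gov/roads.zip",
    "http://health.example.gov/inspections.json",
]

for host, count in insecure_hosts(catalog).most_common():
    print(f"{host}: {count} dataset(s) still on HTTP")
```

Hosts at the top of that list are the natural candidates either for enabling HTTPS in place or for moving their datasets into a central repository.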

Let’s not wait for open data to be dragged into the future by default. Instead, the sector should take the lead, assuring data publishers and consumers alike that access to data is private and that the data arrives unaltered.