Using Amazon S3 in Enterprise GIS

If you are not an Amazon Web Services customer, you should be. If you are, and you've spent any amount of time on your AWS Management Console homepage, you have been overwhelmed by the choices. At the time of writing, there are 25 different services to choose from. I'm not talking about buying books here; I mean Elastic Transcoders, SNS, SES, Beanstalks, Clouds, Glaciers, Route 53, Gateways, Pipelines and something new called Redshift (sounds like the B-movie version of Red Dawn). I am just going to focus on one: Amazon S3.

From Amazon:

Amazon Simple Storage Service (Amazon S3) is storage for the internet. You can use Amazon S3 to store and retrieve any amount of data at any time, from anywhere on the web.

In layman's terms, it's like a My Documents folder on the internet. But it's not like Dropbox, Box, SkyDrive or any other cloud-based storage you're thinking of. The closest I can come up with is FTP without the TP. It's an exposed web server folder... but better. I'm not making much sense here.

These folders are called, wait for it... Buckets. I guess maybe the term 'Folder' is copyrighted by Microsoft. You can put anything you want in these Buckets. In terms of GIS data, this is NOT a really good place for dynamic information. What is the most static, thick, heavy, bandwidth-hogging file you can think of? Pictures. We've found Public Works guys love to take photos. Water main breaks, electrical poles, as-builts, hand-drawn maps: they're all there. For one customer we put references to these images on a map as points, with the filename stored in the attribute table. Now, how do we get to the images when we're on a web map, or from a hyperlink within ArcGIS Desktop...? S3 Buckets.

So far, for two separate clients (soon to be 3), we have created a public-facing bucket and uploaded their 100s and 1,000s of photos. Could you do this with Dropbox? Not really. With their 2-3 GB limit, have fun. Maybe you could follow this handy tutorial that shows you how to link ONE-FILE-AT-A-TIME into your GIS data, and then you can get a lobotomy. OK, I know Flickr has their 1 TB limit, but the problem with all those cloud photo storage sites is still the unique, randomized Tokens. It is a security measure. You know how that SmugMug album you share with grandma has some random "kjsadf87i23ujasoiX30LK" value in the URL? That's a random token. Almost every time, if you actually figure out how to link directly to the image, that Token URL is different for EACH picture. Here it's "_zpse59ecc34" and here it's "_zps6ed573a5": same Album, different Token. If you're trying to build a picture location from an attribute, this will break your automation process in a quick-hurry.
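Here is a rough sketch of what a bulk upload like that could look like in Python. This is not the tool we used; it assumes the boto3 library, and the `upload_photos` helper, the extension list, the bucket name and the folder path are all made up for illustration. The S3 client is passed in, so the function itself only needs the standard library.

```python
import mimetypes
from pathlib import Path

# Hypothetical list of file types to treat as photos.
PHOTO_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def upload_photos(s3_client, bucket, photo_dir, prefix=""):
    """Upload every photo under photo_dir; return the object keys written."""
    root = Path(photo_dir)
    uploaded = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in PHOTO_EXTENSIONS:
            continue
        # Keep the local folder layout so the key is predictable from the filename.
        key = prefix + path.relative_to(root).as_posix()
        content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        # upload_file is the boto3 S3 client call; ContentType lets browsers
        # display the image instead of downloading it.
        s3_client.upload_file(
            str(path), bucket, key, ExtraArgs={"ContentType": content_type}
        )
        uploaded.append(key)
    return uploaded
```

With a real client this would be called as something like `upload_photos(boto3.client("s3"), "example-city-photos", "/data/photos")`. Keep in mind that new objects are private by default, so a public-facing bucket also needs a policy that allows anonymous reads.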

AWS S3 is the best solution we've found. It's there, it's always on, there are uploading tools, there's no login (unless you want security, then you can have it) and best of all: the URL you create links directly to the file, and the prefix does NOT change.
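Because the prefix never changes, the link for any photo can be built straight from the filename in the attribute table. A minimal sketch, assuming the bucket is publicly readable and using S3's virtual-hosted-style address (`https://<bucket>.s3.amazonaws.com/<key>`); the `photo_url` helper, the bucket name and the filenames are hypothetical.

```python
from urllib.parse import quote


def photo_url(bucket, filename, folder=""):
    """Build the public S3 URL for a photo from its attribute-table filename."""
    # Join the optional folder and the filename into the object key.
    key = "/".join(part.strip("/") for part in (folder, filename) if part)
    # Percent-encode spaces and other unsafe characters, but keep the slashes.
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


# Hypothetical rows from a feature class attribute table.
rows = [
    {"OBJECTID": 1, "PHOTO": "main break 01.jpg"},
    {"OBJECTID": 2, "PHOTO": "pole_117.jpg"},
]
for row in rows:
    row["PHOTO_URL"] = photo_url("example-city-photos", row["PHOTO"], folder="water")
```

The resulting `PHOTO_URL` value is just another attribute, so it can be used as a hyperlink field or in a web map popup.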

I'm just scratching the surface here. AWS is loaded with goodies that the geospatial industry has not even attempted to use. I'm convinced that within 7-10 years most businesses, and some local governments, will never have to deal with an on-premises server again.

**Just to melt your face, the Buckets can also serve up webpages. This Esri FlexViewer app is being served from an S3 bucket (the data, however, is on our EC2 instance). Your next obvious question: could you create an application and upload it to an S3 bucket without the need for any web server? Yes. Yes you can. Call us.
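For the curious, turning on webpage hosting for a bucket is a single API call. This is only a sketch, again assuming boto3; the helper names and the bucket name are hypothetical, and the app's files still have to be uploaded and made publicly readable separately.

```python
def website_configuration(index="index.html", error=None):
    """Build the WebsiteConfiguration structure that S3 expects."""
    config = {"IndexDocument": {"Suffix": index}}
    if error:
        config["ErrorDocument"] = {"Key": error}
    return config


def enable_static_site(s3_client, bucket, index="index.html", error=None):
    """Turn on static website hosting for a bucket."""
    # put_bucket_website is the boto3 S3 client call for this setting.
    s3_client.put_bucket_website(
        Bucket=bucket,
        WebsiteConfiguration=website_configuration(index, error),
    )
```

With a real client: `enable_static_site(boto3.client("s3"), "example-viewer-app", error="error.html")`.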