This Slide Intentionally Left Blank

What it is, and what it is NOT

RIP Ray Tomlinson

Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can.

~ Jamie Zawinski (JWZ), The Law of Software Envelopment

THE SESSIONS: ACT ONE

• My Greatest Flaw is that I am Slow

• My organization, the School District of Escambia County (ECSD), is coming to The Cloud.

• We want to understand it before we make it integral to our operations.

• Thus, four sessions on Google Apps.

• Each builds on the previous one.

• Some observations might be trite or even inaccurate.

• Call me out on those.

• Our approach begins outside an older application, and moves inward.

WHERE ECSD IS NOW

• Management believes maintaining three separate e-mail systems (GroupWise, Gmail, O365 Outlook) is silly.

• Pick the simplest to maintain and go with that.

• Gmail is required by GAPS which is required by .

• Most users will not notice missing features.

• Management is generally trustworthy and intelligent.

• Power users want us to move to Gmail.

• Lots of “When are we moving?” commentary.

• But quite a bit of the reverse as well, from other types of power users.

• Especially calendar folks.

• I have a natural suspicion of closed, vendor-locked systems (Thanks, Pearson!)

• Google bristles and points to Vault when we say that.

• Those of us maintaining the e-mail systems want to make an informed decision.

• I’ve used Gmail for years, but what do I know about it?

• Putting aside transition and licensing costs, what is the “Right” thing?

INQUIRY GROUND RULES

• The Internet is full of Guides to Gmail with helpful tips.

• And yet Gmail is really a moving target from an interface perspective

• Fundamental features are another story.

• How useful is the past in our rapidly iterating technological space?

• If you use the term “digital immigrant”, that’s a paddlin’.

• “Millennial” — you know THAT’s a paddlin’.

• The history of Gmail has been narrated in a number of places

• Including Wikipedia

• I prefer “Founders at Work” by Jessica Livingston

• Interview with Paul Buchheit

• Recall the lessons of “The Power Broker”

• Google is not going to publish deep implementation details.

• Let us not let Kremlinology carry us away.

WHY DO USERS PREFER GMAIL?

• The interface feels futuristic.

was new when it premiered.

• More discussion down the line.

• Their most common tasks are within easy reach.

• It is highly tolerant of “messy” user behavior.

• Benefit of the doubt goes to the user, not the machine.

• This is very hard to do.

• Easy to access from any device.

• Nothing is ever lost.

• It learns.

• Google has done a great job of selling itself as a bunch of geniuses inventing the future.

• Fair enough. Everyone has an image to maintain.

• Is German engineering really that much better than French?

• The guys working on the self-driving cars are not working on Gmail.

IS GMAIL E-MAIL?

• Question is not meant to be “provocative”.

• Answer invites exploration.

• Is E-mail based on its traditional implementation?

• Or its purpose?

• And if Gmail fails both of those tests, is that problematic?

TRADITIONAL E-MAIL

• One sender, one or more recipients.

• Private, not public.

• Otherwise it would be Usenet / NNTP.

• “Store and Forward”

• Shuttling text files across a network.

stored on a disk.

• First retrieved on the mail server machine

• Which might serve lots of purposes.

• PINE, MUTT, EMACS (see JWZ quote)

• Later, retrieved via network protocols.

• Support for attachments and rich text via MIME.

• The network is built on trust.

• Until the Long September.

• September 1992 had 30 days.

• September 1993 has 8,223 days and counting.

EMAIL STANDARDS

• SMTP - Sending and Transfers.

• POP - Retrieval to local client.

• IMAP - Reading from server.

• MIME - Content encoding.

• MX Records - DNS Routing for E-Mail.

• Headers etc. defined by RFC, but mainly by folklore.

• Storage: Berkeley MBOX format, spool files.

• LDAP: Original Directory.

• CardDAV: Replacement Directory.

• CalDAV: Calendaring.

• ActiveSync: Mobile Collaboration.

THE TRAGEDY OF EMAIL

• The foundational standards of e-mail are uniformly terrible, and we have spent decades trying to build something sensible upon them.

• SMTP was too naive, and so we added relay restrictions, authentication, encryption, and finally SPF trickery.

• POP was limited to copy-then-delete from a single directory.

• So we invented IMAP for multi-folder, multi-client.

• And did anyone ever implement it well on the server or client?

• MX Records had to be supplemented by other methods of domain verification.

• The very existence of anti-spam technologies demonstrates poor design.

• And let us not even speak of strategies for generating headers, threading, quoting, summarizing (all sadly documented at http://jwz.org/hacks with discussions regarding Netscape Mail v2 and v3)

• Until very late in the game there was no comprehensive standard for calendaring (CalDAV).

• So we invented Groupware …

“GROUPWARE BAD”

• Internet E-mail was the world of those who had Internet access.

• Local networks developed their own messaging systems independent of it.

• Support for most standards initially bolted on, and always uneasy.

• Exchange: Nobody has ever been able to adequately explain Exchange architecture to me.

• At least it gave us ActiveSync.

• Lotus Notes: A synchronization engine that generated e-mail, calendaring, workflow apps, websites, CMS, and much more.

• Taco Bell Syndrome - “It’s all the same, it’s just the way you fold it.”

• GroupWise: I have an entire presentation on the Architectural History of GroupWise.

• The past glory of GroupWise is its data store.

• The future glory might be extending that via APIs.

• Apple Mail: Designed emlx as a sort of “Super MBOX” format.

• Flags designed for Apple Mail client.

• Optimized for Spotlight metadata and searching.

WHERE IS FUNCTIONALITY IMPLEMENTED?

• We might think of features as plotted on a quadrant with two axes:

• In or Above the Data Layer

• Through Standard E-mail Mechanism or Through Proprietary Tech

• Example: Checking whether a recipient has read an e-mail.

• GroupWise implements functionality as integral part of the data layer using proprietary calls.

• Lotus Notes implements functionality through “read receipts” as a quasi-standard feature interpreted by recipient’s client.

• Some clients store read receipts with the message, some treat them as a separate, standalone message.

ENTER GMAIL

• Gmail was initially built upon two existing Google projects

• Google Groups (a web layer above Usenet)

• Free text searching (applied to general datasets)

• All that code has definitely been rewritten.

• Goal was to find ways of helping users with massive mailboxes manage their E-mail.

• Implication is that the E-Mail data storage began as standard inboxes (maybe mbox format).

• But it could not stay that way.

• The basic approach lives on in “Gmailify”

• Had to be web-based because Google is a web-oriented company.

SEARCH CAUSES STORAGE PROBLEMS

• Typical approach to e-mail search is to do a straightforward search through E-Mail data store based on filter and keywords.

• Indexes can be built ahead of time to speed searches based on structured data (subject, addresses, time).

• Google Web Search crawls the web server farms to build search indexes.

• Only once the information is brought into the index through crawl retrieval can the data be searched.

• Early Gmail users expected mail to be searchable as soon as it entered inbox.

• But it could not be, because building the search index takes time once a message is incorporated into the data set.

• Moreover, the model for serving web content requires distribution and redundancy of data across multiple web servers.

• How do we guarantee that a user has full access to their e-mail inbox with current data?

SEARCH SOLUTIONS TRANSFORMED GMAIL

• New methods of searching had to be built just for E-Mail

• “Google” was too slow!

• A Guess: Speculative indexes and other aggressive, resource-intensive data sets have to be generated ahead of time so that messages can be broken up and slotted into them as soon as they are received.

• The back end data for Gmail must be optimized for quick and continuous processing by very efficient algorithms.

• As such it must operate on almost the entire Gmail data set continuously.

• A Guess: This means that the entire data set is one, and partitioning the data sets beyond single users is actually kind of hard.
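To make the guess above concrete, here is a minimal sketch of an index that is updated at delivery time rather than by a later crawl. It is only an illustration of the general idea, not a description of how Gmail works: the class name `IncrementalMailIndex`, its methods, and the sample addresses are all hypothetical.

```python
from collections import defaultdict
import re


class IncrementalMailIndex:
    """Toy inverted index updated on delivery (hypothetical, not Gmail's design)."""

    def __init__(self):
        # token -> set of ids of the messages that contain that token
        self._postings = defaultdict(set)
        self._messages = {}

    @staticmethod
    def _tokenize(text):
        # Lowercase the text and split it into alphanumeric tokens.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def deliver(self, msg_id, sender, subject, body):
        """Store the message and slot its tokens into the index in one step."""
        self._messages[msg_id] = (sender, subject, body)
        for token in self._tokenize(f"{sender} {subject} {body}"):
            self._postings[token].add(msg_id)

    def search(self, query):
        """Return the ids of messages that contain every token in the query."""
        tokens = self._tokenize(query)
        if not tokens:
            return set()
        result = None
        for token in tokens:
            ids = self._postings.get(token, set())
            result = set(ids) if result is None else result & ids
        return result


index = IncrementalMailIndex()
index.deliver(1, "alice@example.com", "Budget meeting", "Agenda attached for Friday")
index.deliver(2, "bob@example.com", "Lunch", "Friday at noon?")

print(sorted(index.search("friday")))         # [1, 2]
print(sorted(index.search("friday agenda")))  # [1]
```

Because indexing happens inside `deliver`, a message is searchable the moment it arrives. The cost is that every delivery touches the shared index, which is one way to picture why the data set resists partitioning.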

• Gmail is Groupware, but at a huge and simplified level.

• Google is the group, and the user is the user.

• All other relationships are shims.

• German Gmail (Googlemail) and Google Apps domain e-mail demonstrate this paradigm.

• All doors lead to Gmail

• The price of all this is that it is hard to separate educational data from other data.

• Privacy lawsuits.

FREE CANDY FOR EVERYONE!

• Unlimited storage is part of the model, not merely a selling point.

• You are already holding multiples of the data for indexes, backups, server farm distribution etc.

• A Guess: Cutting the data set down is not desired

• Trimming is another (inefficient) operation.

• The algorithms learn better on large data sets.

• Hence, it is hard to delete e-mails, most are just archived

• Although, deletion has to exist for strong reasons.

GMAIL STORAGE

• The data does not represent anything one might recognize as traditionally structured e-mail data

• Nor the esoteric formats supported by Groupware products.

• Multiple abstraction layers in front of the user are required.

• Even when exporting mail, the data presented to the user is a complex composite.

• The same is true for protocol access.

• Labels are not folders.

• Labels are additive metadata.

• Folders imply absolute data organization, which does not exist.

• Threads are not generated after the fact, but are a fundamental structure.

• A single message is just a very short thread.
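The label and thread points above can be sketched in a few lines. This is a simplified model under stated assumptions, not Gmail's actual storage: the `Message` and `Thread` classes, the label names, and the rule that a thread shows the union of its messages' labels are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: int
    subject: str
    # Labels are additive metadata: a message can carry any number of them.
    labels: set = field(default_factory=set)


@dataclass
class Thread:
    thread_id: int
    # The thread is the basic unit; a lone message is a thread of length one.
    messages: list = field(default_factory=list)

    def labels(self):
        """Assumed rule: a thread carries the union of its messages' labels."""
        result = set()
        for message in self.messages:
            result |= message.labels
        return result


def threads_with_label(threads, label):
    """A 'folder view' is just a query over labels; nothing is moved."""
    return [t.thread_id for t in threads if label in t.labels()]


single = Thread(1, [Message(10, "Bus schedule", {"INBOX", "Transportation"})])
chain = Thread(2, [
    Message(20, "Budget", {"INBOX", "Finance"}),
    Message(21, "Re: Budget", {"Finance", "Transportation"}),
])
threads = [single, chain]

print(threads_with_label(threads, "Transportation"))  # [1, 2]
print(threads_with_label(threads, "Finance"))         # [2]

# "Archiving" in this model only removes a label; the message itself stays put.
single.messages[0].labels.discard("INBOX")
print(threads_with_label(threads, "INBOX"))           # [2]
```

The same thread appears under both "Transportation" and "Finance" without being copied, which is exactly what a folder hierarchy cannot express.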

• Original interface was too slow, even beyond searching.

• Entire page had to reload when retrieving information.

• Response was to do pioneering work with AJAX.

• Pages update pieces of data without reloading entire DOM.

• Speed depends on server/network speed and client JavaScript speed.

• This is now the world we live in.

• When we discuss Classroom we will see endgame.

• Primary Gmail interface is the web.

• The fundamental organizational metaphor is search.

• It is not possible to reorder messages by sender as in a traditional client.

• One filters the search instead.

• Or dumps the data to a traditional client.

• Secondary Gmail interface is now Google’s own applications (Inbox etc.)

• Starting point is web interface, and builds from there.

• We could spend a lot of time figuring out when features are implemented in standards and by more direct means (Quadrant)

• The third interface is standards, but before we discuss those again …

INTERFACE CHANGES

• Casual users despise interface refreshes for any web service.

• But their memories are short, and penalties are low.

• Interface changes are necessary.

• Improvements to HTML/CSS/JavaScript make richer functionality possible, faster (more responsive), and more efficient.

• Pushed to a great extent by changes Google itself makes via the implementation of Chrome.

• The ability to improve the interface in absolute terms is imperative for web applications in a universe where native mobile and desktop applications and new interfaces such as VR and Hololens are in active development.

• Fashion plays a role because style subconsciously ages.

• Look at a desktop UI from a decade ago.

• Google’s brand is the future.

• Think of Gmail interface as a car model year.

• Google’s corporate priorities play a role.

• Google+

• Nothing else needs to be said.

THE THIRD AND FOURTH INTERFACE

• Third Interface: The old E-Mail “Standards”

• Google’s implementation of them is not perfect

• Especially IMAP

• But they were terrible anyway

• And they are provided as a separate compatibility layer outside of the true nature of the product.

• Why did Google implement and then drop ActiveSync?

• ActiveSync was only useful until CalDAV.

• Then not worth the licensing cost.

• Lives on for corporate and education customers.

• Very strict verification (SMTP cli etc.) because of size of target.

• Fourth Interface: The API

• Useful for leveraging pieces of Gmail in other applications.

• A clue: Google advises against trying to build a mail client with it.

• We will discuss this in depth in Act 3.

MESSY USERS

• Free-text searching encourages tolerating “messy” user behavior because free-text search is itself messy.

• Searches for items “almost like” what you want become possible.

• As “we” learn more about users:

• Great implications for spam and junk processing.

• And for detecting other types of useful information in their messages — plane tickets, contacts, implied appointments

• And of course “we” need to sell them ads.

• But can we predict their behavior?

• “Smart Reply” in the Inbox application seems to think so.

UNDISCIPLINED USERS

• Self-Discipline and concentration are limited resources.

• Like muscles, they can be built but eventually exhaust.

• As IT workers, we expect our users to use the systems properly.

• But is this the best use of their discipline and concentration given their goals and the goals of our organization?

• We are here to educate children and create a generation that benefits themselves and our society.

• Discipline in communications is part of that, but which part?

• Should we just let our users live in their own mess?

• If they never delete or organize mail, how does that affect our mission?

DON’T WE LIKE OUR USERS?

• Why call them messy and undisciplined?

• Is the job of IT to say “no” or “yes”?

• Both?

• When?

• Aren’t we trying to support personalized learning?

• If so, won’t dictating their access get in the way of that?

• But if everyone is personalized all the time, how do we bring folks together for common goals?

• How do we get access to their data to direct group operations?

• Google provides E-Mail groups, but no domain address book.

• How do we provide legally-mandated oversight?

• In principle Google stands behind this, but how does it work in practice?

• Doesn’t Google like us?

• Maybe they like us.

• But they like our users more.

• The data set gets in the way.

HOW LONG CAN GMAIL BE FREE?

• Google is giving it away for free to personal users and education.

• Google remains profitable with no signs that will change.

• Gmail was released to the public 12 years ago.

• The programming costs are amortized over time.

• Hardware costs in per-user terms will fall.

• The user base will not rise exponentially in a short time period.

• Ergo, Gmail can be free for the indefinite future.

• Google’s confidence in that is reflected in its guarantees.

A SENIOR IN A LOGAN’S RUN WORLD

• Gmail’s Groupware functionality is limited to standards.

• In part because Google thinks it has better answers in other products for the questions traditionally posed by extended Groupware.

• At what part of the quadrant do Gmail’s newer features operate?

• Messages from appear to come from a no-reply user.

• But the interface detects and displays a direct link to the document.

• Google Inbox features seem to blur the lines further.

• Again, definitely an area for further research.

WHAT GMAIL IS

• A web application offering search through e-mail.

• A single enormous data set

• With an E-Mail abstraction layer.

• That implements standard E-Mail-adjacent protocols.

• A mature product.

WHAT GMAIL IS NOT

• File-System Based Data

• Organization-Based Collaboration

• More Than E-mail, Contacts, and Calendaring

• Gmail is Done

INTERMISSION

• Check out Jamie Zawinski’s thoughts on e-mail:

• http://jwz.org/doc

• http://jwz.org/hacks

• “Coders at Work” by Peter Seibel

• In the next session, we will see a further iteration of the massive data set that began with Gmail:

• Classroom

• Johnnie Odom of the School District of Escambia County has been your host.

• Questions?

• Thank You