Overview Information Access and Communication Networks in the z Poor Man’s Broadband Developing-world – Poor Man’s Cache – Packet Containment z TEK Search Umar Saif z Inverse of Cellular LUMS, Connections [email protected] | [email protected] z Teleputer (Time permitting)

Motivation Internet in Pakistan

Developed Developing z Facts of life in the developing world

World e World

d – Expensive International 2 MB Internet i 2 MB Internet Connection Connection v – No real peering points < $40 i > $4000

D – Internet used over dialup Bulk Data Bulk Data Transfer on Transfer on • Poor “Scratch card” provisioning the Internet l the Internet > 70%

a < 15%

t i Average End- Average End-

user g user i Bandwidth via Bandwidth

ISP D via ISP > 100 kb/sec < 10 kb/sec

Internet in Pakistan How I Stumbled Upon this?

z Average Dialup Bandwidth z “Good research solves real problems in a – Less than 10 kb/sec practical way” z Almost Never Used for – Started last year when I wanted to exchange a 3.5 MB PDF file with my dad – Exchanging – Two laptops sitting next to each other – Disseminating – No way to exchange data if you don’t have portable – Accessing storage! …. Content larger than a couple of hundred • We actually went our and bought a CDR to exchange kilobyes data….

1 Problem Solution <10kb/sec Internet Bypass the Internet when exchanging large

Internet

~ 56kb/sec ~ 56kb/sec

Not a Last Mile Problem

Email Attachments Disruptive Technology z Time to exchange a 3.5 MB file on the z Of course Internet also started as an Internet ~ 1 hours (16 Kb/sec) overlay over the phone lines – 30 mins upload and download z A new kind of Internet – Assuming no disconnections z Reminiscent of Pre-Internet days z Time now (40 kb/sec) – FidoNet – 12 mins!! – UUCP

Dialup P2P-ISP Interleaving

Why is this Practical? Key Idea: Use Internet as a directory service, not as digital pipe

Internet z Phone bills are becoming “Flat” – Rs 199/month -- free nation-wide calls z But cannot always connect to the server z As long as one can identify a “close-by” host, “broadband access” is free Line-speed (~40kb/s) dialup z P2P systems already follow a similar connections Peer-to-peer model dialup connections – Incentive-based BitTorrent Interleaving of ISP- Dialup P2P dialup Underlay connections

2 Dialup BitTorent Other Challenges

BitTorrent in a z Overhead of Peer connections: ~30 sequential mode sec z Offline Block Discovery z Last Block Problem z Flash Crowds (Backoff for congestion control) z Budget-based Download

Peer Connection Overhead Offline Block Discovery

DitTorrent(Greedy) Worst Case(N-C alls) Best Case(lgN-C alls) A 2 3 7 9

1 3 6 8 B

Virally Spread 6 Knowledge of File Blocks A 1 3 6 8 B 2 3 7 9

Offline Block Discovery Last Block Problem

Downloading Time • Grab rare blocks first Online Block Discovery Offline Block Discovery • Favor those who will finish at the end of the connection 100.00

90.00 Globally Rarest First(GRF) Scheme

80.00 3500.00 70.00 3000.00 60.00 2500.00 50.00

40.00 2000.00

Time Taken(min)Time 30.00 1500.00

20.00 1000.00

10.00 500.00 0.00 0.00 5 5 5 5 .5 0.00 1000.00 2000.00 3000.00 4000.00 5000.00 6000.00 0. 1.5 2. 3.5 4. 5.5 6. 7.5 8 9.5 File Size(KB) File Size(MB)

3 Flash crowds Budget-based Download

• Each busy-tone costs ~10 secs Incomplete Nodes (%age) Data Acquired(% age) • Wait between calls (nominal) • Back-off in times of congestion (busy tones)

Optimum Choking Interval

WBC=1 WBC=25

1600.00 1400.00 1200.00 1000.00 800.00 600.00

Time Taken(min) Time 400.00 200.00 0.00 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Choking Interval

Three Evolving Applications Can we do the same on the “Internet” z P2P file-sharing z Contain packets within the developing- world z Web-browsing – Routing paths are screwed up z Large attachments • “All roads lead to Rome” i.e. transit out of the country z Low cache hit-rates – Too many small ISPs, no sharing of cached content http://dittorrent.sourceforge.net – Misconfigured “Proxies” – Hit rates < 30% (instead of >60%) z Poor DNS support

Work-in-progress ChoupalLink z Indirect Routing – Force routing paths by vectoring messages through intermediate nodes – Initial results show improvement across all Inverse multiplexing High-bandwidth Virtual Channel traffic metrics over GSM/GPRS/ z An ISP-independent distributed cache – Similar to CoralCDN z An ISP-independent DNS Inverse Multiplexing over cellular connections

4 TEK: Time Equals Knowledge TEK z Web Search for Low-bandwidth, z Internet in the developing-world intermittently connected users – Expensive – Intermittent z One of the first examples of mainstream ICTD research ~circa z An Email-based Internet Search Facility 2000 – Asynchronous Dialup model – Search optimized for bandwidth and latency z Renewed interest with DTNs coming rather than speed in vogue – Heavy client-side caching

TEK TEK Server

z Remove Duplicate Content z Cluster results z Distinguish Content from links z Remove images z Remove background code z Compress results

Rationale Our Teleputer z Lower operational costs – Caching vs internet download – ISP-host connection • Reliable • Higher Bandwdith • Cheaper: email-only account – Better utilization of Internet Connection

5 Multi-user devices Teleputer

z Zero-configuration z Text-free Interface z Sensor-actuator z Cell-phone integrated z Shared Computing z Server-style processing

Teleputer Sensors Teleputer Operation

Network connected to teleputer device

Workstation Workstation Workstation

Television Workstation Workstation Workstation Mobile Device

Workstation Workstation Workstation Laptop computer

Built on MOTE Based Teleputer Device

Berkeley MOTE

Thank you!

[email protected]

http://www.dritte.org

6