Scalpel: Optimizing Query Streams Using Semantic Prefetching
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
Minimax Redundancy for Markov Chains with Large State Space
Minimax redundancy for Markov chains with large state space Kedar Shriram Tatwawadi, Jiantao Jiao, Tsachy Weissman May 8, 2018 Abstract For any Markov source, there exist universal codes whose normalized codelength approaches the Shannon limit asymptotically as the number of samples goes to infinity. This paper investigates how fast the gap between the normalized codelength of the “best” universal compressor and the Shannon limit (i.e. the compression redundancy) vanishes non-asymptotically in terms of the alphabet size and mixing time of the Markov source. We show that, for Markov sources whose relaxation time is at least $1 + \frac{2+c}{\sqrt{k}}$, where $k$ is the state space size (and $c > 0$ is a constant), the phase transition for the number of samples required to achieve vanishing compression redundancy is precisely $\Theta(k^2)$. 1 Introduction For any data source that can be modeled as a stationary ergodic stochastic process, it is well known in the literature of universal compression that there exist compression algorithms without any knowledge of the source distribution, such that its performance can approach the fundamental limit of the source, also known as the Shannon entropy, as the number of observations tends to infinity. The existence of universal data compressors has spurred a huge wave of research around it. A large fraction of practical lossless compressors are based on the Lempel–Ziv algorithms [ZL77, ZL78] and their variants, and the normalized codelength of a universal source code is also widely used to measure the compressibility of the source, which is based on the idea that the normalized codelength is “close” to the true entropy rate given a moderate number of samples. -
Digital Communication Systems 2.2 Optimal Source Coding
Digital Communication Systems EES 452 Asst. Prof. Dr. Prapun Suksompong [email protected] 2. Source Coding 2.2 Optimal Source Coding: Huffman Coding: Origin, Recipe, MATLAB Implementation. Examples of Prefix Codes: Nonsingular, Fixed-Length Code, Shannon–Fano code, Huffman Code. Prof. Robert Fano (1917-2016), Shannon Award (1976). Shannon–Fano Code: Proposed in Shannon’s “A Mathematical Theory of Communication” in 1948. The method was attributed to Fano, who later published it as a technical report. Fano, R.M. (1949). “The transmission of information”. Technical Report No. 65. Cambridge (Mass.), USA: Research Laboratory of Electronics at MIT. Should not be confused with Shannon coding, the coding method used to prove Shannon's noiseless coding theorem, or with Shannon–Fano–Elias coding (also known as Elias coding), the precursor to arithmetic coding. Claude E. Shannon Award: Claude E. Shannon (1972), Elwyn R. Berlekamp (1993), Sergio Verdu (2007), David S. Slepian (1974), Aaron D. Wyner (1994), Robert M. Gray (2008), Robert M. Fano (1976), G. David Forney, Jr. (1995), Jorma Rissanen (2009), Peter Elias (1977), Imre Csiszár (1996), Te Sun Han (2010), Mark S. Pinsker (1978), Jacob Ziv (1997), Shlomo Shamai (Shitz) (2011), Jacob Wolfowitz (1979), Neil J. A. Sloane (1998), Abbas El Gamal (2012), W. Wesley Peterson (1981), Tadao Kasami (1999), Katalin Marton (2013), Irving S. Reed (1982), Thomas Kailath (2000), János Körner (2014), Robert G. Gallager (1983), Jack Keil Wolf (2001), Arthur Robert Calderbank (2015), Solomon W. Golomb (1985), Toby Berger (2002), Alexander S. Holevo (2016), William L. Root (1986), Lloyd R. Welch (2003), David Tse (2017), James L. -
Principles of Communications ECS 332
Principles of Communications ECS 332 Asst. Prof. Dr. Prapun Suksompong (ผศ.ดร.ประพันธ์ สุขสมปอง) [email protected] 1. Intro to Communication Systems Office Hours: Check Google Calendar on the course website. Dr. Prapun’s Office: 6th floor of Sirindhralai building, BKD. Remark 1: If the downloaded file crashed your device/browser, try another one posted on the course website. Remark 2: There are also three more sections from the Appendices of the lecture notes. Shannon's insight: “The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.” Shannon, Claude. A Mathematical Theory Of Communication. (1948) Shannon: Father of the Info. Age. Documentary co-produced by the Jacobs School, UCSD-TV, and the California Institute for Telecommunications and Information Technology [http://www.uctv.tv/shows/Claude-Shannon-Father-of-the-Information-Age-6090] [http://www.youtube.com/watch?v=z2Whj_nL-x8] C. E. Shannon (1916-2001): "Hello. I'm Claude Shannon, a mathematician here at the Bell Telephone Laboratories." He didn't create the compact disc, the fax machine, digital wireless telephones, or MP3 files, but in 1948 Claude Shannon paved the way for all of them with the basic theory underlying digital communications and storage. He called it information theory. C. E. Shannon (1916-2001) https://www.youtube.com/watch?v=47ag2sXRDeU C. E. Shannon (1916-2001): One of the most influential minds of the 20th century, yet when he died on February 24, 2001, Shannon was virtually unknown to the public at large. C. -
IEEE Information Theory Society Newsletter
IEEE Information Theory Society Newsletter Vol. 59, No. 3, September 2009 Editor: Tracey Ho ISSN 1059-2362 Optimal Estimation XXX Shannon lecture, presented in 2009 in Seoul, South Korea (extended abstract) J. Rissanen 1 Prologue The first quest for optimal estimation by Fisher, [2], Cramer, Rao and others, [1], dates back to over half a century and has changed remarkably little. The covariance of the estimated parameters was taken as the quality measure of estimators, for which the main result, the Cramer-Rao inequality, sets a lower bound. It is reached by the maximum likelihood (ML) estimator for a restricted subclass of models and asymptotically for a wider class. The covariance, which is just one property of models, is too weak a measure to permit extension to estimation of the number of parameters, which is handled by various ad hoc criteria too numerous to list here. Soon after I had studied Shannon’s formal definition of information in random variables and his other remarkable performance bounds for communication, [4], I wanted to apply them to other fields – in particular to estimation and statistics in general. 2 Modeling Problem The modeling problem begins with a set of observed data $Y = \{y_t : t = 1, 2, \ldots, n\}$, often together with other so-called explanatory data $Y, X = \{(y_t, x_{1,t}, x_{2,t}, \ldots)\}$. The objective is to learn properties in $Y$ expressed by a set of distributions as models $\{f(Y \mid X_s; \theta, s)\}$. Here $s$ is a structure parameter and $\theta = \theta_1, \ldots, \theta_{k(s)}$ real-valued parameters, whose number depends on the structure. The structure is simply a subset of the models. Typically it is used to indicate the most important variables in $X$. (The traditional name for the set of models is ‘likelihood function’ although no such concept exists in probability theory.) The most important problem is the selection of the model -
Itnl0908.Pdf
IEEE Information Theory Society Newsletter Vol. 58, No. 3, September 2008 Editor: Daniela Tuninetti ISSN 1059-2362 Annual IT Awards Announced The principal annual awards of the Information Theory Society were announced at the Toronto ISIT. The 2009 Shannon Award goes to Jorma Rissanen. The 2008 Wyner Award goes to Vincent Poor. The winners of the 2008 IT Paper Award are two 2006 IT Transactions papers on compressed sensing by David Donoho and by Emmanuel Candes and Terence Tao, respectively (with acknowledgment of an earlier paper by Candes, Romberg and Tao). The winner of the 2008 Joint IT/ComSoc Paper Award is a Communications Transactions paper on Accumulate-Repeat-Accumulate Codes by Aliazam Abbasfar, Dariush Divsalar and Kung Yao. Three student authors of ISIT papers received ISIT Student Paper Awards: Paul Cuff of Stanford, Satish Korada of EPFL, and Yury Polyanskiy of Princeton. Finally, the 2008 Chapter of the Year Award was presented to the Kitchener- ... of the U.S. National Academy of Engineering (2007 to date). He will be General Co-Chair of the 2009 ISIT in Seoul. The Information Theory Society Paper Award is given annually to an outstanding publication in the fields of interest to the Society appearing anywhere during the preceding two calendar years. The winners of the 2008 award are "Compressed sensing," by David Donoho, which appeared in the April 2006 IEEE Transactions on Information Theory, and "Near-optimal signal recovery from random projections: universal encoding strategies," by Emmanuel Candes and Terence Tao, which appeared in the December 2006 Transactions. These two overlapping papers are the foundations of the exciting new field of compressed sensing. -
Evaluation of Image Compression Algorithms for Electronic Shelf Labels
UPTEC IT 09 018 Degree project, 30 credits, December 2009 Evaluation of Image Compression Algorithms for Electronic Shelf Labels Robin Kuivinen. Faculty of Science and Technology, UTH unit. Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0. Postal address: Box 536, 751 21 Uppsala. Telephone: 018 – 471 30 03. Fax: 018 – 471 30 00. Website: http://www.teknat.uu.se/student Abstract An advantageous innovation for retail stores is the ESL system, which consists of many electronic units, called labels, showing product and price information to customers on small displays. Such a system offers, among other things, efficient price updates that imply lower costs by reducing man-hours and paper volumes. Data transferred to a label can be compressed in order to minimize the data size and thus lower the updating time and the energy consumption of a label. In this study, twelve lossless compression prototypes were implemented and evaluated, using a corpus set of bi-level images, with respect to compressibility, encoding and decoding time. Six of those prototypes were subject to case studies of real ESL data and further studies of memory consumption. The results showed that the low-precision arithmetic coder LowPac, which uses a context-based probability model, was one of the best prototypes over all data sets in terms of compressibility and it was also the most memory-efficient prototype. Using the corpus set, LowPac obtained encoding and decoding times 2 times longer than one of the fastest prototypes, but with 108% improvement of compressibility, on average. Supervisor: Nils Hulth. Subject reviewer: Cris Luengo. Examiner: Anders Jansson. ISSN: 1401-5749, UPTEC IT09 018. Printed by: Reprocentralen ITC. Summary A retail shop owner faces a time-consuming process when new prices have to be marked on goods. -
Ieee-Level Awards
IEEE-LEVEL AWARDS The IEEE currently bestows a Medal of Honor, fifteen Medals, thirty-three Technical Field Awards, two IEEE Service Awards, two Corporate Recognitions, two Prize Paper Awards, Honorary Memberships, one Scholarship, one Fellowship, and a Staff Award. The awards and their past recipients are listed below. Citations are available via the “Award Recipients with Citations” links within the information below. Nomination information for each award can be found by visiting the IEEE Awards Web page www.ieee.org/awards or by clicking on the award names below. Links are also available via the Recipient/Citation documents. MEDAL OF HONOR Award Recipients with Citations (PDF, 26 KB) The IEEE Medal of Honor is the highest IEEE award. The Medal was established in 1917 and is awarded for an exceptional contribution or an extraordinary career in the IEEE fields of interest. The candidate need not be a member of the IEEE. The IEEE Medal of Honor is sponsored by the IEEE Foundation. Recipients: E. H. Armstrong 1917; E. F. W. Alexanderson 1919; Guglielmo Marconi 1920; R. A. Fessenden 1921; Lee de Forest 1922; John Stone-Stone 1923; M. ... Ernst A. Guillemin 1961; Edward V. Appleton 1962; John H. Hammond, Jr. 1963; George C. Southworth 1963; Harold A. Wheeler 1964; Claude E. Shannon 1966; Charles H. Townes 1967; Gordon K. Teal 1968; Edward L. Ginzton 1969; Dennis Gabor 1970; John Bardeen 1971; Jay W. Forrester 1972; Rudolf Kompfner 1973; Rudolf E. Kalman 1974; John R. Pierce 1975; H. Earle Vaughan 1977; Robert N. Noyce 1978; Richard Bellman 1979; William Shockley 1980; Sidney Darlington 1981; John Wilder Tukey 1982 -
IEEE Information Theory Society Newsletter
IEEE Information Theory Society Newsletter Vol. 70, No. 3, September 2020 EDITOR: Salim El Rouayheb ISSN 1059-2362 President’s Column Aylin Yener We are quickly approaching the end of summer and the reopening of campuses at least across the US. As I write this, I find myself looking at the same view as I was when writing the previous columns, with perhaps a slight change (for the better) in the temperature and a new end-of-the-summer breeze that makes sitting on my balcony so much better. I feel lucky once again to have an outdoor space; one I came to treasure much more than I did in BC (Before COVID). I realize I have forgotten the view from the office I got to sit in no more than ten days before the closure in March and try to recall, to no avail, if any part of the city skyline was visible. I should have taken pictures, I conclude. ... experience from these reputed information theorists. I am pleased to report that our Distinguished Lecturer program is gearing up to be back on, but with virtual lectures. The Board of Governors has extended the terms of the current lecturers by a year so that when traveling resumes, the lecturers will have the opportunity to recoup the lost year in in-person chapter visits. For now, all chapters should consider inviting the lecturers for online lectures, which can be facilitated on Zoom or Webex. I hope that the chapters and our members will find this resource useful. We are continuing to live our lives online (al- -
Principles of Communications ECS 332
Principles of Communications ECS 332 Asst. Prof. Dr. Prapun Suksompong (ผศ.ดร.ประพันธ์ สุขสมปอง) [email protected] 1. Intro to Communication Systems Office Hours: BKD, 6th floor of Sirindhralai building; Wednesday 13:45-15:15, Friday 13:45-15:15. “The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.” Shannon, Claude. A Mathematical Theory Of Communication. (1948) Shannon: Father of the Info. Age. Documentary co-produced by the Jacobs School, UCSD-TV, and the California Institute for Telecommunications and Information Technology. Won a Gold award in the Biography category in the 2002 Aurora Awards. [http://www.uctv.tv/shows/Claude-Shannon-Father-of-the-Information-Age-6090] [http://www.youtube.com/watch?v=z2Whj_nL-x8] C. E. Shannon (1916-2001): 1938 MIT master's thesis: A Symbolic Analysis of Relay and Switching Circuits. Insight: The binary nature of Boolean logic was analogous to the ones and zeros used by digital circuits. The thesis became the foundation of practical digital circuit design. The first known use of the term bit to refer to a “binary digit.” Possibly the most important, and also the most famous, master’s thesis of the century. It was simple, elegant, and important. C. E. Shannon: Master Thesis. Boole/Shannon Celebration: Events in 2015 and 2016 centered around the work of George Boole, who was born 200 years ago, and Claude E. Shannon, born 100 years ago. Events were scheduled both at the University College Cork (UCC), Ireland and the Massachusetts Institute of Technology (MIT) http://www.rle.mit.edu/booleshannon/ An Interesting Book: The Logician and the Engineer: How George Boole and Claude Shannon Created the Information Age by Paul J. -
Digital Communication Systems ECS 452
Digital Communication Systems ECS 452 Asst. Prof. Dr. Prapun Suksompong (ผศ.ดร.ประพันธ์ สุขสมปอง) [email protected] 1. Intro to Digital Communication Systems Office Hours: BKD, 6th floor of Sirindhralai building; Monday 10:00-10:40, Tuesday 12:00-12:40, Thursday 14:20-15:30. “The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.” Shannon, Claude. A Mathematical Theory Of Communication. (1948) Shannon: Father of the Info. Age. Documentary co-produced by the Jacobs School, UCSD-TV, and the California Institute for Telecommunications and Information Technology. Won a Gold award in the Biography category in the 2002 Aurora Awards. [http://www.uctv.tv/shows/Claude-Shannon-Father-of-the-Information-Age-6090] [http://www.youtube.com/watch?v=z2Whj_nL-x8] C. E. Shannon (1916-2001): 1938 MIT master's thesis: A Symbolic Analysis of Relay and Switching Circuits. Insight: The binary nature of Boolean logic was analogous to the ones and zeros used by digital circuits. The thesis became the foundation of practical digital circuit design. The first known use of the term bit to refer to a “binary digit.” Possibly the most important, and also the most famous, master’s thesis of the century. It was simple, elegant, and important. C. E. Shannon: Master Thesis. Boole/Shannon Celebration: Events in 2015 and 2016 centered around the work of George Boole, who was born 200 years ago, and Claude E. Shannon, born 100 years ago. Events were scheduled both at the University College Cork (UCC), Ireland and the Massachusetts Institute of Technology (MIT) http://www.rle.mit.edu/booleshannon/ An Interesting Book: The Logician and the Engineer: How George Boole and Claude Shannon Created the Information Age by Paul J. -
Distance Path: because, regardless of what happens subsequently, this path will have a larger Hamming distance from y.
Direct Minimum Distance Decoding. Suppose y = [11 01 11]. Find the message b which corresponds to the (valid) codeword x with minimum (Hamming) distance from y, that is, $\hat{x} = \arg\min_{x} d(x, y)$. For a 3-bit message, there are $2^3 = 8$ possible codewords. We can list all possible codewords. However, here, let’s first try to work on the distance directly. [Trellis figure: states 00, 10, 01, 11; branches labeled input/output (branch metric), including 0/00 (2), 0/00 (1), 0/00 (2) along the top path and 1/10 (1) leaving state 11.] The number in parentheses on each branch is the branch metric, obtained by counting the differences between the encoded bits and the corresponding bits in y. Path distances d(x, y) for each message b: 000: 2+1+2 = 5; 001: 2+1+0 = 3; 010: 2+1+1 = 4; 011: 2+1+1 = 4; 100: 0+2+0 = 2; 101: 0+2+2 = 4; 110: 0+0+1 = 1; 111: 0+0+1 = 1. Viterbi decoding: Developed by Andrew J.
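The distance table in the direct minimum distance decoding snippet can be reproduced by brute force, as a minimal sketch. The snippet does not state the encoder, so the code assumes a rate-1/2 convolutional encoder with generators (111, 101) starting in state 00; this choice is inferred only because it matches the branch labels and all eight distances shown. The function names are hypothetical.

```python
from itertools import product

def encode(bits):
    """Assumed encoder: rate 1/2, generators (111, 101), initial state 00."""
    s1 = s2 = 0                      # s1 = previous input bit, s2 = the one before
    out = []
    for b in bits:
        out += [b ^ s1 ^ s2, b ^ s2]  # two output bits per input bit
        s1, s2 = b, s1                # shift the register
    return out

def hamming(a, b):
    """Number of positions in which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def decode_min_distance(y, k):
    """Try all 2**k messages and keep one whose codeword is closest to y."""
    return min(product((0, 1), repeat=k), key=lambda b: hamming(encode(b), y))

y = [1, 1, 0, 1, 1, 1]               # received word [11 01 11]
table = {b: hamming(encode(b), y) for b in product((0, 1), repeat=3)}
for b, d in table.items():
    print("".join(map(str, b)), d)
print("decoded:", decode_min_distance(y, 3))
```

Messages 110 and 111 tie at distance 1; `min` returns whichever comes first in enumeration order, so a real decoder would need an explicit tie-breaking rule. The exhaustive search costs 2^k encodings, which is why a trellis-based method becomes attractive for longer messages.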