Studying the Real World
Today's topics
  - Free and open source software (FOSS)
      - What is it, who uses it, history
  - Making the most of other people's software
      - Learning from, using, and contributing
  - Learning about your own system
      - Using tools to understand software without source

Free and open source software
  - Access to source code
  - Free = freedom to use, modify, copy
  - Some potential benefits
      - Can build for different platforms and needs
      - Development driven by community
      - Different perspectives and ideas
      - More people looking at the code for bugs/security issues
  - Structure
      - Volunteers, sponsored by companies
      - Generally anyone can propose ideas and submit code
      - Different structures in charge of what features/code gets in

Free and open source software
  - Tons of FOSS out there
      - Nearly everything on myth
      - Desktop applications (Firefox, Chromium, LibreOffice)
      - Programming tools (compilers, libraries, IDEs)
      - Servers (Apache web server, MySQL)
  - Many companies contribute to FOSS
      - Android core
      - Apple Darwin
      - Microsoft .NET

A brief history of FOSS
  - 1960s: Software distributed with hardware
      - Source included, users could fix bugs
  - 1970s: Start of software licensing
      - 1974: Software is copyrightable
      - 1975: First license for UNIX sold
  - 1980s: Popularity of closed-source software
      - Software valued independent of hardware

Richard Stallman
  - Started the free software movement (1983)
  - The GNU project
      - GNU = GNU's Not Unix
      - An operating system with a Unix-like interface
  - GNU General Public License
      - Free software: users have access to source, can modify and redistribute
      - Distributed modifications must be shared under the same license
  - Many of the tools we use are GNU
      - bash, emacs, gcc, gdb, make
      - coreutils (ls, head, tail, sort, ...)

Linus Torvalds
  - Created the Linux kernel

Fast forward
  - GNU/Linux has become very popular
      - Feedback loop: use Linux because it supports lots of hardware, contribute support for your hardware back
  - Companies open source tools and libraries
      - More users, more apps for their platform

Where to get it
  - Generally through version control
      - glibc: https://sourceware.org/git/?p=glibc.git
      - musl: https://git.musl-libc.org/cgit/musl
      - busybox: https://git.busybox.net/busybox/
  - Many projects on GitHub
      - gcc: https://github.com/gcc-mirror/gcc
      - linux: https://github.com/torvalds/linux
  - Of course, not limited to C/systems code

Learning from real-world software
  - New functions, syntax, concepts
      - E.g. strtok, memmove

Learning from real-world software
  - Code quality
      - Style standards and enforcement
      - GNU Coding Standards
      - Linux kernel coding style:
          First off, I’d suggest printing out a copy of the GNU coding standards, and NOT read it. Burn them, it’s a great symbolic gesture.
          -- https://www.kernel.org/doc/html/v4.10/process/coding-style.html

Learning from real-world software
  - Testing strategies and frameworks
      - Busybox testsuite: https://git.busybox.net/busybox/tree/testsuite

Learning from real-world software
  - Development workflows
      - Many small changes vs. large patches
      - Code review
      - Release cycles
  - Best (or at least common...) practices
      "You can either hang out in the Android Loop or the HURD loop."

Getting involved
  - Install and use it
      - Experiment with changes, new releases
  - User communities
      - Forums, mailing lists
      - Ask questions, request features
  - Bug/issue trackers
      - See what kinds of issues come up
      - Discussions about design, process, portability, ...

Example: GRUB stack exploit

    static int
    grub_username_get (char buf[], unsigned buf_size)
    {
      unsigned cur_len = 0;
      int key;

      while (1)
        {
          key = grub_getkey ();
          [...]
          if (key == '\b')
            {
              cur_len--;
              grub_printf ("\b");
              continue;
            }
          [...]
        }
      grub_memset (buf + cur_len, 0, buf_size - cur_len);
      [...]
    }

  -- http://hmarco.org/bugs/CVE-2015-8370-Grub2-authentication-bypass.html

Example: Flash memcpy
  Description of problem: Strange sound when playing mp3 on website using flash (using Shockwave Flash 10.2 d161).
  -----
  The trigger of the problem is the glibc version. [...]
  -----
  valgrind?
  -----
  Looking at the changelog for glibc-2.12.90-4 shows:
    * Fri Jul 02 2010 Andreas Schwab <[email protected]> - 2.12.90-4
    - Improve 64bit memcpy/memmove for Atom, Core 2 and Core i7
  I used chromium to run valgrind on the flash plugin [...]
    ==2100== Thread 9:
    ==2100== Source and destination overlap in memcpy(0x256d7170, 0x256d7570, 1280)
    ==2100==    at 0x4A06A3A: memcpy (mc_replace_strmem.c:497)
  -- https://bugzilla.redhat.com/show_bug.cgi?id=638477

Example: Flash memcpy
  So in the kernel we have a pretty strict "no regressions" rule, and that if people depend on interfaces we exported having side effects that weren't intentional, we try to fix things so that they still work unless there is a major reason not to. So I'm disappointed glibc just closes this as NOTABUG. There's no real reason to do the copy backwards that I can see, so doing it that way is just stupid.
  -- Linus Torvalds, https://bugzilla.redhat.com/show_bug.cgi?id=638477

Example: Flash memcpy
  (In reply to comment #39)
  > The only stupidity is crap software violating well known rules that have
  > existed forever.
  Umm. Bugs happen. That's a fact. You can call it "crap software" all you like, but the thing is, if memcpy doesn't warn about overlaps, there's no test coverage, and in that case even well-designed software will have bugs. Then the question becomes one of "Why break it?"
  -- Linus Torvalds, https://bugzilla.redhat.com/show_bug.cgi?id=638477

Aside: Licenses
  - Myth: If code is published online, I can use it. WRONG!
  - There are lots of issues with licensing and permitted use
  - Permissive licenses (BSD, MIT, Apache)
      - Few restrictions on use, modification, distribution
      - Can use in proprietary software
  - "Copyleft" licenses (GNU GPL)
      - Must make source code of changes available when you distribute them
      - Can only be integrated into projects with compatible licenses
  - Many companies have "open source disclosures" with lots of interesting code

Aside: Licenses
  - No license = no permission
      "If you find software that doesn’t have a license, that generally means you have no permission from the creators of the software to use, modify, or share the software. Although a code host such as GitHub may allow you to view and fork the code, this does not imply that you are permitted to use, modify, or share the software for any purpose."
      -- https://choosealicense.com/

Learning from binaries
  - You don't need the source code to learn how things work
  - Can use the same techniques you've learned in 107 on real programs
      - strings, valgrind, objdump/gdb
  - Some new tools: ltrace, strace, file

Takeaways
  - You've learned lots of super practical skills
      - Command line/unix, reading C code, inspecting programs through assembly
  - Where to go from here
      - Learn from the process and designs of others
      - Use open source projects to run with your own ideas
      - Get involved with a project that interests you