Cycle Accurate Profiling with Perf

Total Page:16

File Type:pdf, Size:1020Kb

Cycle Accurate Profiling with Perf CycleCycle Accurate Accurate Profiling Profiling With With Perf Perf Paweł Moll <[email protected]> 1 The plan 2 The plan n Hardware n Linux perf n Let’s hack! 3 Hardware 4 ARM CoreSight System Interconnect (AXI) Cross Trigger Matrix (CTM) CTI CTI CTI DAP AXI-AP STM TPIU JTAG- Funnel SWJ-DP AP Cortex ETM Processor ETB APB-AP Debug APB Interconnect (APBIC) n Source n Bus n Sink 5 Processor trace n Embedded Trace Macrocell n Instructions n Data n Program Trace Macrocell n Only branches n Bandwidth n From 10Mbps to many Gbps per core n Non- (or low-) intrusive debug 6 Sinks n Embedded Trace Buffer n Dedicated, small SRAM n Flight recorder use case n Trace Port n External analyser n Large buffer n External (high speed) pins n Embedded Trace Router n Sinks data into main interconnect n Usually uses system DRAM n Consumes memory system bandwidth 7 ETM Protocol n Highly compressed data n Generated against program memory n Based on E/N atoms n eg. b1NEEEE00 up to 16+1 instructions n Branches n Only if not evident (eg. eg. B <imm>) n No address == previous address n Exceptions, instruction sets, processor state n Synchronisation packets n Data packets n PTM protocol n One bit per conditional branch 8 Issues n Requires memory contents for decompression n Multitasking OS n JIT engines n Self-modifying code (kernel runtime patching, kprobes, dynamic trace events) n Parallel and out-of-order excecution 9 Additional features n Filtering n Address n CONTEXTID n VMID n Trigerring n Address n DBG <imm> n Counter n Sequencer n Timestamping n Correlation (synchronisation) 10 Linux perf 11 Linux perf framework n PMU drivers n Many use cases, eg. statistics n Sampling profiler n Periodic PC (IP) sampling n Timer or PMU counter overflow interrupt n Typical sampling rate 1kHz (every 1ms) 12 Sampling profile n Statistical approximation of a process n Think analog/digital converter n Shannon’s theorem: “If a function x(t) contains no frequencies higher than B cps, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart.” 13 CoreSight Linux framework n Developed by Mathieu Poirier at Linaro n Based on 2012 code from Code Aurora n At v7 stage now (http://lwn.net/Articles/614232/) n Control trace components via sysfs n Enable sink, enable source, dump buffer contents n Separate decoders n Full series at http://git.linaro.org/kernel/coresight.git 14 Intel PT n “an exciting new feature coming in future processors” (2013) n Integrating with perf n Auxiliary buffers n Decoder integrated with user space tool n Kernel portions at v4 stage now (http://lwn.net/Articles/609010/) n Full series at https://github.com/virtuoso/linux-perf/tree/intel_pt 15 Let’s hack! 16 Particularly pathologic example n Rotate JPEG file / # time taskset 4 ./gm convert -rotate 90 in.jpg out.jpg real 0m 0.01s user 0m 0.01s sys 0m 0.00s 17 Can it go faster? / # time perf record -F 1000 -e cpu-clock \ taskset 4 ./gm convert -rotate 90 in.jpg out.jpg [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data (~50 samples) ] real 0m 0.27s user 0m 0.14s sys 0m 0.13s n It was: real 0m 0.01s user 0m 0.01s sys 0m 0.00s 18 Report # Samples: 17 of event 'cpu-clock' # Event count (approx.): 17000000 # # Overhead Command Shared Object Symbol # ........ ....... ................. ........................ # 17.65% gm [kernel.kallsyms] [k] filemap_map_pages 5.88% taskset [kernel.kallsyms] [k] filemap_map_pages 5.88% gm gm [.] LocaleCompare 5.88% gm gm [.] forward_DCT_float 5.88% gm gm [.] encode_mcu_huff 5.88% gm gm [.] ycc_rgb_convert 5.88% gm gm [.] decode_mcu_DC_first 5.88% gm gm [.] decode_mcu_AC_refine 19 Report, cont. 5.88% gm gm [.] jpeg_fdct_16x16 5.88% gm gm [.] _IO_link_in 5.88% gm gm [.] malloc 5.88% gm gm [.] strncpy 5.88% gm [kernel.kallsyms] [k] lock_acquire 5.88% gm [kernel.kallsyms] [k] lock_release 5.88% gm [kernel.kallsyms] [k] unmap_single_vma 20 Let’s have a closer look… n Cortex-A7 ETM 3.5 n Instructions only n Cycle accurate n Captured with DStream & ARM DS-5 n 10MB of binary trace data 21 What can we see there? n Decoded into text format n 890MB file n What to look for: ELF Header: Entry point address: 0x96dd 8360: 000096dd 0 FUNC GLOBAL DEFAULT 6 _start 8845: 0012ac44 0 FUNC GLOBAL DEFAULT 9 _fini 22 Here we go 2859224 S:0x8000E4CC E28DD00C 0 ADD sp,sp,#0xc ret_to_user_from_irq 2859225 S:0x8000E4D0 E1B0F00E 1 MOVS pc,lr ret_to_user_from_irq Return from exception Timestamp: 1878241989137 S:0x000096DC F04F0B00 29 MOV r11,#0 <Unknown> Exception: PREFETCH_ABORT (11) 2859227 S:0xFFFF000C EA000443 21 B PRRR+16027512 ; 0xFFFF1120 <Unknown> Timestamp: 1878241989138 2859228 S:0xFFFF1120 E24EE004 18 SUB lr,lr,#4 <Unknown> 2859229 S:0xFFFF1124 E88D4001 3 STM sp,r0,lr <Unknown> 23 Here we go again 2878794 S:0x8000E4D0 E1B0F00E 1 MOVS pc,lr ret_to_user_from_irq Return from exception Timestamp: 1878241990817 2878795 S:0x000096DC F04F0B00 175 MOV r11,#0 <Unknown> 2878796 S:0x000096E0 F04F0E00 40 MOV lr,#0 <Unknown> 2878797 S:0x000096E4 BC02 4 POP r1 <Unknown> 2878798 S:0x000096E6 466A 1 MOV r2,sp <Unknown> [...] 2878804 S:0x000096F6 4B04 1 LDR r3,[pc,#16] ; [0x9708] = 0xB114687C <Unknown> 2878805 S:0x000096F8 F0DAFDF4 0 BL __libc_start_main ; 0xE42E4 <Unknown> S:0x000E42E4 E92D45F0 324 PUSH r4-r8,r10,lr __libc_start_main Exception: PREFETCH_ABORT (11) 2878807 S:0xFFFF000C EA000443 21 B PRRR+16027512 ; 0xFFFF1120 <Unknown> 24 Going down 9747820 S:0x0012AC44 E91D4008 153 PUSH r3,lr <Unknown> 9747821 S:0x0012AC48 E8BD8008 2 POP r3,pc <Unknown> 9747822 S:0x000E9AA2 E7BF 9 B __run_exit_handlers+28 ; 0xE9A24 __run_exit_handlers 9747823 S:0x000E9A24 6873 3 LDR r3,[r6,#4] __run_exit_handlers 9747824 S:0x000E9A26 EB061403 3 ADD r4,r6,r3,LSL #4 __run_exit_handlers 9747825 S:0x000E9A2A B173 1 CBZ r3,__run_exit_handlers+66 ; 0xE9A4A __run_exit_handlers 25 The end of the process 9747990 S:0x000FD536 4C12 2 LDR r4,[pc,#72] ; [0xFD580] = 0x64C55B39 _Exit 9747991 S:0x000FD538 E004 0 B _Exit+24 ; 0xFD544 _Exit 9747992 S:0x000FD544 4618 1 MOV r0,r3 _Exit 9747993 S:0x000FD546 F04F0CF8 1 MOV r12,#0xf8 _Exit 9747994 S:0x000FD54A F7E7F9C9 0 BL __libc_do_syscall ; 0xE48E0 _Exit 9747995 S:0x000E48E0 B580 1 PUSH r7,lr __libc_do_syscall 9747996 S:0x000E48E2 4667 2 MOV r7,r12 __libc_do_syscall 9747997 S:0x000E48E4 DF00 1 SVC #0x0 __libc_do_syscall Exception: SUPERVISOR_CALL (10) n #0xf8 is __NR_exit_group 26 Focus on the task n It’s all interesting… n …but the goal is to see a cycle accurate profile of the process n Limit data scope n Filter out kernel addresses (or cheat using entry/exit points) n Filter out other contexts (or cheat by protecting CPU) n Collate migrated data (or cheat by setting affinity) n Generate memory map with DSOs (or cheat by linking statically) n Convert into perf data stream 27 perf.data n Starts with header n Description of all events n Followed by data records … n selection of samples n by default: PERF_SAMPLE_IP, PERF_SAMPLE_TID, PERF_SAMPLE_TIME, PERF_SAMPLE_PERIOD n generated on every timer/counter interrupt n …interleaved with system information n eg. PERF_RECORD_MMAP, PERF_RECORD_COMM, PERF_RECORD_EXIT, PERF_RECORD_FORK 28 perf.data, cont. $ perf report -D [...] 0x1d0 [0x28]: event: 9 . ... raw event: size 40 bytes . 0000: 09 00 00 00 01 00 28 00 54 fd 05 80 00 00 00 00 ......(.T....... 0010: 3b 08 00 00 3b 08 00 00 46 fb 39 91 dd 46 00 00 ;...;...F.9..F.. 0020: 40 42 0f 00 00 00 00 00 @B...... 77917438212934 0x1d0 [0x28]: PERF_RECORD_SAMPLE(IP, 1): \ 2107/2107: 0x8005fd54 period: 1000000 addr: 0 ... thread: gm:2107 ...... dso: [kernel.kallsyms] 29 The Hack n Replace data records with trace-based ones n reducing number of samples, from 40 to 24 bytes per record (attribute modification needed) n generating multiple samples, one per cycle used (175 samples if instruction took 175 cycles to execute) n 192MB big perf.data 30 The Hack, cont. 0x1f0 [0x18]: event: 9 . ... raw event: size 24 bytes . 0000: 09 00 00 00 02 00 18 00 dc 96 00 00 00 00 00 00 ................ 0010: 3b 08 00 00 3b 08 00 00 ;...;... 0x1f0 [0x18]: PERF_RECORD_SAMPLE(IP, 2): \ 2107/2107: 0x96dc period: 1 addr: 0 ... thread: :2107:2107 ...... dso: <not found> 0x208 [0x18]: event: 9 . ... raw event: size 24 bytes . 0000: 09 00 00 00 02 00 18 00 dc 96 00 00 00 00 00 00 ................ 0010: 3b 08 00 00 3b 08 00 00 ;...;... 0x208 [0x18]: PERF_RECORD_SAMPLE(IP, 2): \ 2107/2107: 0x96dc period: 1 addr: 0 31 ... thread: :2107:2107 Result # Samples: 8M of event 'cycles' # Event count (approx.): 8012563 # # Overhead Command Shared Object Symbol # ........ ....... ................ .................................. # 8.98% gm gm [.] jpeg_idct_islow 7.25% gm gm [.] strncpy 6.60% gm gm [.] jpeg_fdct_16x16 4.40% gm gm [.] encode_mcu_huff 4.37% gm gm [.] decode_mcu_AC_refine 4.25% gm gm [.] LocaleCompare 3.73% gm gm [.] rgb_ycc_convert 3.25% gm gm [.] _int_free 32 Result, cont 3.17% gm gm [.] memset 3.15% gm gm [.] jpeg_gen_optimal_table 3.01% gm gm [.] malloc 2.63% gm gm [.] _int_malloc 2.63% gm gm [.] forward_DCT_float 2.59% gm gm [.] ycc_rgb_convert 2.43% gm gm [.] jpeg_fdct_float 2.29% gm gm [.] __pthread_mutex_unlock_usercnt 2.25% gm gm [.] __memcpy_neon 2.19% gm gm [.] pthread_mutex_lock 1.74% gm gm [.] ReadJPEGImage 1.72% gm gm [.] encode_mcu_gather 1.61% gm gm [.] WriteJPEGImage 1.39% gm gm [.] vfprintf 33 Result, cont 1.32% gm gm [.] consume_data 1.21% gm gm [.] forward_DCT 1.06% gm gm [.] RegisterMagickInfo 0.88% gm gm [.] UnregisterMagickInfo 0.87% gm gm [.] decode_mcu_AC_first 0.81% gm gm [.] strlen 0.73% gm gm [.] SyncCacheNexus 0.64% gm gm
Recommended publications
  • Kernel Boot-Time Tracing
    Kernel Boot-time Tracing Linux Plumbers Conference 2019 - Tracing Track Masami Hiramatsu <[email protected]> Linaro, Ltd. Speaker Masami Hiramatsu - Working for Linaro and Linaro members - Tech Lead for a Landing team - Maintainer of Kprobes and related tracing features/tools Why Kernel Boot-time Tracing? Debug and analyze boot time errors and performance issues - Measure performance statistics of kernel boot - Analyze driver init failure - Debug boot up process - Continuously tracing from boot time etc. What We Have There are already many ftrace options on kernel command line ● Setup options (trace_options=) ● Output to printk (tp_printk) ● Enable events (trace_events=) ● Enable tracers (ftrace=) ● Filtering (ftrace_filter=,ftrace_notrace=,ftrace_graph_filter=,ftrace_graph_notrace=) ● Add kprobe events (kprobe_events=) ● And other options (alloc_snapshot, traceoff_on_warning, ...) See Documentation/admin-guide/kernel-parameters.txt Example of Kernel Cmdline Parameters In grub.conf linux /boot/vmlinuz-5.1 root=UUID=5a026bbb-6a58-4c23-9814-5b1c99b82338 ro quiet splash tp_printk trace_options=”sym-addr” trace_clock=global ftrace_dump_on_oops trace_buf_size=1M trace_event=”initcall:*,irq:*,exceptions:*” kprobe_event=”p:kprobes/myevent foofunction $arg1 $arg2;p:kprobes/myevent2 barfunction %ax” What Issues? Size limitation ● kernel cmdline size is small (< 256bytes) ● A half of the cmdline is used for normal boot Only partial features supported ● ftrace has too complex features for single command line ● per-event filters/actions, instances, histograms. Solutions? 1. Use initramfs - Too late for kernel boot time tracing 2. Expand kernel cmdline - It is not easy to write down complex tracing options on bootloader (Single line options is too simple) 3. Reuse structured boot time data (Devicetree) - Well documented, structured data -> V1 & V2 series based on this. Boot-time Trace: V1 and V2 series V1 and V2 series posted at June.
    [Show full text]
  • Linux Perf Event Features and Overhead
    Linux perf event Features and Overhead 2013 FastPath Workshop Vince Weaver http://www.eece.maine.edu/∼vweaver [email protected] 21 April 2013 Performance Counters and Workload Optimized Systems • With processor speeds constant, cannot depend on Moore's Law to deliver increased performance • Code analysis and optimization can provide speedups in existing code on existing hardware • Systems with a single workload are best target for cross- stack hardware/kernel/application optimization • Hardware performance counters are the perfect tool for this type of optimization 1 Some Uses of Performance Counters • Traditional analysis and optimization • Finding architectural reasons for slowdown • Validating Simulators • Auto-tuning • Operating System optimization • Estimating power/energy in software 2 Linux and Performance Counters • Linux has become the operating system of choice in many domains • Runs most of the Top500 list (over 90%) on down to embedded devices (Android Phones) • Until recently had no easy access to hardware performance counters, limiting code analysis and optimization. 3 Linux Performance Counter History • oprofile { system-wide sampling profiler since 2002 • perfctr { widely used general interface available since 1999, required patching kernel • perfmon2 { another general interface, included in kernel for itanium, made generic, big push for kernel inclusion 4 Linux perf event • Developed in response to perfmon2 by Molnar and Gleixner in 2009 • Merged in 2.6.31 as \PCL" • Unusual design pushes most functionality into kernel
    [Show full text]
  • Linux on IBM Z
    Linux on IBM Z Pervasive Encryption with Linux on IBM Z: from a performance perspective Danijel Soldo Software Performance Analyst Linux on IBM Z Performance Evaluation _ [email protected] IBM Z / Danijel Soldo – Pervasive Encryption with Linux on IBM Z: from a performance perspective / © 2018 IBM Corporation Notices and disclaimers • © 2018 International Business Machines Corporation. No part of • Performance data contained herein was generally obtained in a this document may be reproduced or transmitted in any form controlled, isolated environments. Customer examples are without written permission from IBM. presented as illustrations of how those • U.S. Government Users Restricted Rights — use, duplication • customers have used IBM products and the results they may have or disclosure restricted by GSA ADP Schedule Contract with achieved. Actual performance, cost, savings or other results in IBM. other operating environments may vary. • Information in these presentations (including information relating • References in this document to IBM products, programs, or to products that have not yet been announced by IBM) has been services does not imply that IBM intends to make such products, reviewed for accuracy as of the date of initial publication programs or services available in all countries in which and could include unintentional technical or typographical IBM operates or does business. errors. IBM shall have no responsibility to update this information. This document is distributed “as is” without any warranty, • Workshops, sessions and associated materials may have been either express or implied. In no event, shall IBM be liable for prepared by independent session speakers, and do not necessarily any damage arising from the use of this information, reflect the views of IBM.
    [Show full text]
  • Linux Performance Tools
    Linux Performance Tools Brendan Gregg Senior Performance Architect Performance Engineering Team [email protected] @brendangregg This Tutorial • A tour of many Linux performance tools – To show you what can be done – With guidance for how to do it • This includes objectives, discussion, live demos – See the video of this tutorial Observability Benchmarking Tuning Stac Tuning • Massive AWS EC2 Linux cloud – 10s of thousands of cloud instances • FreeBSD for content delivery – ~33% of US Internet traffic at night • Over 50M subscribers – Recently launched in ANZ • Use Linux server tools as needed – After cloud monitoring (Atlas, etc.) and instance monitoring (Vector) tools Agenda • Methodologies • Tools • Tool Types: – Observability – Benchmarking – Tuning – Static • Profiling • Tracing Methodologies Methodologies • Objectives: – Recognize the Streetlight Anti-Method – Perform the Workload Characterization Method – Perform the USE Method – Learn how to start with the questions, before using tools – Be aware of other methodologies My system is slow… DEMO & DISCUSSION Methodologies • There are dozens of performance tools for Linux – Packages: sysstat, procps, coreutils, … – Commercial products • Methodologies can provide guidance for choosing and using tools effectively • A starting point, a process, and an ending point An#-Methodologies • The lack of a deliberate methodology… Street Light An<-Method 1. Pick observability tools that are: – Familiar – Found on the Internet – Found at random 2. Run tools 3. Look for obvious issues Drunk Man An<-Method • Tune things at random until the problem goes away Blame Someone Else An<-Method 1. Find a system or environment component you are not responsible for 2. Hypothesize that the issue is with that component 3. Redirect the issue to the responsible team 4.
    [Show full text]
  • On Access Control Model of Linux Native Performance Monitoring Motivation
    On access control model of Linux native performance monitoring Motivation • socialize Perf access control management to broader community • promote the management to security sensitive production environments • discover demand on extensions to the existing Perf access control model 2 Model overview • Subjects: processes • Access control: • superuser root • LSM hooks for MAC (e.g. SELinux) subjects • privileged user groups • Linux capabilities (DAC) • unprivileged users • perf_event_paranoid sysctl access and • Objects: telemetry data • Resource control: • tracepoints, OS events resource control • CPU time: sample rate & throttling • CPU • Memory: perf_events_mlock_kb sysctl • Uncore Objects • Other HW • File descriptors: ulimit -n (RLIMIT_NOFILE) system • Scope • Level user cgroups • process • user mode • cgroups • kernel kernel • system process • hypervisor hypervisor 3 Subjects Users • root, superuser: • euid = 0 and/or CAP_SYS_ADMIN • unprivileged users: • perf_event_paranoid sysctl • Perf privileged user group: -rwxr-x--- 2 root perf_users 11M Oct 19 15:12 perf # getcap perf perf = cap_perfmon,…=ep root unprivileged Perf users 4 Telemetry, scope, level Telemetr Objects: SW, HW telemetry data y Uncore • tracepoints, OS events, eBPF • CPUs events and related HW CPU • Uncore events (LLC, Interconnect, DRAM) SW events • Other (e.g. FPGA) process cgroup user Scope: Level: system kernel • process • user hypervisor • cgroups • kernel Scope Leve • system wide • hypervisor l 5 perf: interrupt took too long (3000 > 2000), lowering kernel.perf_event_max_sample_rate
    [Show full text]
  • Kernel Runtime Security Instrumentation Process Is Executed
    Kernel Runtime Security Instrumentation KP Singh Linux Plumbers Conference Motivation Security Signals Mitigation Audit SELinux, Apparmor (LSMs) Perf seccomp Correlation with It's bad, stop it! maliciousness but do not imply it Adding a new Signal Signals Mitigation Update Audit Audit (user/kernel) SELinux, Apparmor (LSMs) to log environment Perf variables seccomp Security Signals Mitigation Audit SELinux, Apparmor (LSMs) Perf seccomp Update the mitigation logic for a malicious actor with a known LD_PRELOAD signature Signals ● A process that executes and deletes its own executable. ● A Kernel module that loads and "hides" itself ● "Suspicious" environment variables. Mitigations ● Prevent mounting of USB drives on servers. ● Dynamic whitelist of known Kernel modules. ● Prevent known vulnerable binaries from running. How does it work? Why LSM? ● Mapping to security behaviours rather than the API. ● Easy to miss if instrumenting using syscalls (eg. execve, execveat) ● Benefit the LSM ecosystem by incorporating feedback from the security community. Run my code when a Kernel Runtime Security Instrumentation process is executed /sys/kernel/security/krsi/process_execution my_bpf_prog.o (bprm_check_security) bpf [BPF_PROG_LOAD] open [O_RDWR] securityfs_fd prog_fd bpf [BPF_PROG_ATTACH] LSM:bprm_check_security (when a process is executed) KRSI's Hook Other LSM Hooks Tying it all Together Reads events from the buffer and processes them Userspace further Daemon/Agent User Space Buffer Kernel Space eBPF programs output to a buffer process_execution
    [Show full text]
  • Implementing SCADA Security Policies Via Security-Enhanced Linux
    Implementing SCADA Security Policies via Security-Enhanced Linux Ryan Bradetich Schweitzer Engineering Laboratories, Inc. Paul Oman University of Idaho, Department of Computer Science Presented at the 10th Annual Western Power Delivery Automation Conference Spokane, Washington April 8–10, 2008 1 Implementing SCADA Security Policies via Security-Enhanced Linux Ryan Bradetich, Schweitzer Engineering Laboratories, Inc. Paul Oman, University of Idaho, Department of Computer Science Abstract—Substation networks have traditionally been and corporate IT networks are also fundamentally different. isolated from corporate information technology (IT) networks. SCADA protocols provide efficient, deterministic communi- Hence, the security of substation networks has depended heavily cations between devices [3] [4]. Corporate IT protocols upon limited access points and the use of point-to-point supervisory control and data acquisition (SCADA)-specific generally provide reliable communications over a shared protocols. With the introduction of Ethernet into substations, communications channel. pressure to reduce expenses and provide Internet services to Security-Enhanced Linux was initially developed by the customers has many utilities connecting their substation National Security Agency to introduce a mandatory access networks and corporate IT networks, despite the additional control (MAC) solution into the Linux® kernel [5]. The security risks. The Security-Enhanced Linux SCADA proxy was original proof-of-concept Security-Enhanced Linux SCADA introduced
    [Show full text]
  • Embedded Linux Conference Europe 2019
    Embedded Linux Conference Europe 2019 Linux kernel debugging: going beyond printk messages Embedded Labworks By Sergio Prado. São Paulo, October 2019 ® Copyright Embedded Labworks 2004-2019. All rights reserved. Embedded Labworks ABOUT THIS DOCUMENT ✗ This document is available under Creative Commons BY- SA 4.0. https://creativecommons.org/licenses/by-sa/4.0/ ✗ The source code of this document is available at: https://e-labworks.com/talks/elce2019 Embedded Labworks $ WHOAMI ✗ Embedded software developer for more than 20 years. ✗ Principal Engineer of Embedded Labworks, a company specialized in the development of software projects and BSPs for embedded systems. https://e-labworks.com/en/ ✗ Active in the embedded systems community in Brazil, creator of the website Embarcados and blogger (Portuguese language). https://sergioprado.org ✗ Contributor of several open source projects, including Buildroot, Yocto Project and the Linux kernel. Embedded Labworks THIS TALK IS NOT ABOUT... ✗ printk and all related functions and features (pr_ and dev_ family of functions, dynamic debug, etc). ✗ Static analysis tools and fuzzing (sparse, smatch, coccinelle, coverity, trinity, syzkaller, syzbot, etc). ✗ User space debugging. ✗ This is also not a tutorial! We will talk about a lot of tools and techniches and have fun with some demos! Embedded Labworks DEBUGGING STEP-BY-STEP 1. Understand the problem. 2. Reproduce the problem. 3. Identify the source of the problem. 4. Fix the problem. 5. Fixed? If so, celebrate! If not, go back to step 1. Embedded Labworks TYPES OF PROBLEMS ✗ We can consider as the top 5 types of problems in software: ✗ Crash. ✗ Lockup. ✗ Logic/implementation error. ✗ Resource leak. ✗ Performance.
    [Show full text]
  • A Research on Android Technology with New Version Naugat(7.0,7.1)
    IOSR Journal of Computer Engineering (IOSR-JCE) e-ISSN: 2278-0661,p-ISSN: 2278-8727, Volume 19, Issue 2, Ver. I (Mar.-Apr. 2017), PP 65-77 www.iosrjournals.org A Research On Android Technology With New Version Naugat(7.0,7.1) Nikhil M. Dongre , Tejas S. Agrawal, Ass.prof. Sagar D. Pande (Dept. CSE, Student of PRPCOE, SantGadge baba Amravati University, [email protected] contact no: 8408895842) (Dept. CSE, Student of PRMCEAM, SantGadge baba Amravati University, [email protected] contact no: 9146951658) (Dept. CSE, Assistant professor of PRPCOE, SantGadge baba Amravati University, [email protected], contact no:9405352824) Abstract: Android “Naugat” (codenamed Android N in development) is the seventh major version of Android Operating System called Android 7.0. It was first released as a Android Beta Program build on March 9 , 2016 with factory images for current Nexus devices, which allows supported devices to be upgraded directly to the Android Nougat beta via over-the-air update. Nougat is introduced as notable changes to the operating system and its development platform also it includes the ability to display multiple apps on-screen at once in a split- screen view with the support for inline replies to notifications, as well as an OpenJDK-based Java environment and support for the Vulkan graphics rendering API, and "seamless" system updates on supported devices. Keywords: jellybean, kitkat, lollipop, marshmallow, naugat I. Introduction This research has been done to give you the best details toward the exciting new frontier of open source mobile development. Android is the newest mobile device operating system, and this is one of the first research to help the average programmer become a fearless Android developer.
    [Show full text]
  • How Containers Work Sasha Goldshtein @Goldshtn CTO, Sela Group Github.Com/Goldshtn
    Linux How Containers Work Sasha Goldshtein @goldshtn CTO, Sela Group github.com/goldshtn Agenda • Why containers? • Container building blocks • Security profiles • Namespaces • Control groups • Building tiny containers Hardware Virtualization vs. OS Virtualization VM Container ASP.NET Express ASP.NET Express RavenDB Nginx RavenDB Nginx App App App App .NET Core .NET Core V8 libC .NET Core .NET Core V8 libC Ubuntu Ubuntu Ubuntu RHEL Ubuntu Ubuntu Ubuntu RHEL Linux Linux Linux Linux Linux Kernel Kernel Kernel Kernel Kernel Hypervisor Hypervisor DupliCated Shared (image layer) Key Container Usage Scenarios • Simplifying configuration: infrastructure as code • Consistent environment from development to production • Application isolation without an extra copy of the OS • Server consolidation, thinner deployments • Rapid deployment and scaling Linux Container Building Blocks User Namespaces node:6 ms/aspnetcore:2 openjdk:8 Copy-on- /app/server.js /app/server.dll /app/server.jar write layered /usr/bin/node /usr/bin/dotnet /usr/bin/javaControl groups filesystems /lib64/libc.so /lib64/libc.so /lib64/libc.so 1 CPU 2 CPU 1 CPU 1 GB 4 GB 2 GB syscall Kernel Security profile Security Profiles • Docker supports additional security modules that restrict what containerized processes can do • AppArmor: text-based configuration that restricts access to certain files, network operations, etc. • Docker uses seccomp to restrict system calls performed by the containerized process • The default seccomp profile blocks syscalls like reboot, stime, umount, specific types of socket protocols, and others • See also: Docker seccomp profile, AppArmor for Docker Experiment: Using a Custom Profile (1/4) host% docker run -d --rm microsoft/dotnet:runtime …app host% docker exec -it $CID bash cont# apt update && apt install linux-perf cont# /usr/bin/perf_4.9 record -F 97 -a ..
    [Show full text]
  • An Efficient and Flexible Metadata Management Layer for Local File Systems
    An Efficient and Flexible Metadata Management Layer for Local File Systems Yubo Liu1;2, Hongbo Li1, Yutong Lu1, Zhiguang Chen∗;1, Ming Zhao∗;2 1Sun Yat-sen University, 2Arizona State University fyliu789, [email protected], fhongbo.li, yutong.lu, [email protected] Abstract—The efficiency of metadata processing affects the file traditional file systems are difficult to fully utilize the metadata system performance significantly. There are two bottlenecks in locality. metadata management in existing local file systems: 1) Path The path lookup bottleneck also exists in KV-based file lookup is costly because it causes a lot of disk I/Os, which makes metadata operations inefficient. 2) Existing file systems systems, such as TableFS [11] and BetrFS [12], which store have deep I/O stack in metadata management, resulting in metadata in write-optimized KV databases. The reasons for the additional processing overhead. To solve these two bottlenecks, inefficiency of path lookup in these systems are: 1) Queries we decoupled data and metadata management and proposed are slow in the write-optimized databases in some cases. a metadata management layer for local file systems. First, we Most write-optimized databases optimize disk I/O by caching separated the metadata based on their locations in the namespace tree and aggregated the metadata into fixed-size metadata buckets write operations, but additional overhead (e.g. multi-level table (MDBs). This design fully utilizes the metadata locality and query and value update) is required for queries. In addition, di- improves the efficiency of disk I/O in the path lookup. Second, rectory/file entry lookups require logarithmic time complexity we customized an efficient MDB storage system on the raw to complete in these write-optimized index (like LSM-Tree storage device.
    [Show full text]
  • Confine: Automated System Call Policy Generation for Container
    Confine: Automated System Call Policy Generation for Container Attack Surface Reduction Seyedhamed Ghavamnia Tapti Palit Azzedine Benameur Stony Brook University Stony Brook University Cloudhawk.io Michalis Polychronakis Stony Brook University Abstract The performance gains of containers, however, come to the Reducing the attack surface of the OS kernel is a promising expense of weaker isolation compared to VMs. Isolation be- defense-in-depth approach for mitigating the fragile isola- tween containers running on the same host is enforced purely in tion guarantees of container environments. In contrast to software by the underlying OS kernel. Therefore, adversaries hypervisor-based systems, malicious containers can exploit who have access to a container on a third-party host can exploit vulnerabilities in the underlying kernel to fully compromise kernel vulnerabilities to escalate their privileges and fully com- the host and all other containers running on it. Previous con- promise the host (and all the other containers running on it). tainer attack surface reduction efforts have relied on dynamic The trusted computing base in container environments analysis and training using realistic workloads to limit the essentially comprises the entire kernel, and thus all its set of system calls exposed to containers. These approaches, entry points become part of the attack surface exposed to however, do not capture exhaustively all the code that can potentially malicious containers. Despite the use of strict potentially be needed by future workloads or rare runtime software isolation mechanisms provided by the OS, such as conditions, and are thus not appropriate as a generic solution. capabilities [1] and namespaces [18], a malicious tenant can Aiming to provide a practical solution for the protection leverage kernel vulnerabilities to bypass them.
    [Show full text]