Catalogue of Technologies

2020

contents 60 58 56 53 50 48 46 43 41 39 37 35 33 31 29 27 24 22 20 18 16 13 8 7 5

ISP RAS: aninnovationISP RAS: ecosystem Trawl: analysis platform binarycode Texterra: analyzer semantic Talisman: framework adataprocessing Svace analyzer static exploratorySciNoon: search system for groups scientific Retrascope: analysis of static Protosphere anetwork traffic analyzer ISP Obfuscator test programMicroTESK: generator for designing support highly reliable systemsMasiw: virtuallaboratoryLingvodoc: endangered for documenting forKlever: technology Cprograms the checking model digitaltwinplatform DigiTEF: structure adocument retrievalDedoc: system ISP Crusher:adynamicanalysis toolset indexing, 4D: of searching technology Constructivity and Casr: crash analysis andseverity reporter tool analysis tool static BinSide: binarycode AstraVer: verification toolset family andcloudsolutions Asperitas 2020:TechnologiesISP RAS Update 2020:MajorEventsISP RAS Victor Ivannikov anniversary 80th (1940–2016) analysis of largeanalysis of spatial-temporal data TECHNOLOGIES ISP RAS RAS ISP languages software analysis platform on based HDL descriptions

QEMU

Catalogue of Technologies

(1940-2016) 80th anniversary Victor Ivannikov ISP RAS Director.ISP RAS RAS, the of academician and Mathematics, Physics of Doctor Avetisyan Arutyun

place of two money counterflows, flow of basedplace ahigh-quality Victor Petrovich ameeting became was ISPRAS saying that partners were institute. more launched, the studentsjoined time.environment Thenfirst foreign the jointprojects with of ISP first years RAS were hard dueto Russianeconomic and creative atmosphere IPMCE. of Lebedev’s outof inanopen born school school his scientific andachievements: hisexperience of and gave best itthe institutes. Hecreated in1994 suchaninstitute Sciences of by handled appropriateonly be organizations, e.g. Academy research anddeployment. taskcan Andthis state-of-the-art area knowledge requiring developmenttion constant of and Victor Ivannikov system knew programming that isaninnova software. developing anddeploying CAD systems andsupercomputer where he managed Sciences, Academyof the of Cybernetics tem development. 80sheworked of Institute Later inthe inthe plex inmissioncontrol used centers, itsoperating sys andled ­developing AS-6, real-time com amultiprocessor computing a famous mainframe Soviet computer. partin Thenhetook creating project anoperating system of the for BESM-6, (IPMCE),he joined andComputer Engineering sion Mechanics of Precigraduating Institute from atthe Phystech, whilebeing firstprogrammers. Soviet worked andbefriended with After SergeyLebedev,of computerpioneer. aSoviet Victor Petrovich Ivannikov’seducation. teacher was Lev Korolev fromchair the year, developing andnotbreaking new technologies upwith bureaus 3rd anddesign Academy institutes their since of founded. toprojects Itsstudentsused startworking onthe Victor Ivannikov entered Phystechafter itwas (MIPT)soon was his towards attitude around hiswork him. andpeople him about and innovations. thing mostimportant Butthe famously known asPhystech triangle:research, education, shortly.describe away work school, of ascientific Itincludes development to ISPRAS isvery Hiscontribution hardhim. to founder andfirst director, after isnamed institute andnow the Ivannikov’s birthday. Ivannikov Academician was ISPRAS the In 2020we commemorate 80years Victor since Petrovich - - - -

5 Catalogue of Technologies on the contracts for advanced research and development and a flow of talented young engineers. The institute started to stabilize in the 2000s. Intel, HP and became its part- ners. New research directions appeared, such as program analysis for vulnerability detection, binary code analysis, natural language processing. Samsung, the new ISP RAS long-term partner, opened a joint lab with the institute in 2009. We have started working with Russian companies and grew by a lot.

Leading the institute took a lot of time and attention, but Victor Petrovich caught up with everything. He advised stu- dents and postgraduates and continued to teach. He was of the opinion that teaching should be done by engineers with real work experience, and he provided his experience freely to MSU and Phystech students. He always said that people are most important. And now his students are giving lectures and advising young students in the institute.

Time shows that everything taught and said by Victor Petrovich is working. We have created an ecosystem that constantly generates high-skilled specialists and innovations in the area of system programming. We are growing and improving. We make new technologies. Within the last five years ISP RAS grew twofold and our budget increased threefold. And the ba- sis of all our achievements is the attitude taught to us by Victor Petrovich — we live the work.

6 Major Events ISP RAS2020: —  —  —  —  —  —  —  —  —  Open conference.Open OS Day conference, Colloquium, Engineering andISPRAS were heldonlineforfirst time, the namelySYRCoSE Software publicevents coronavirus ISPRAS the pandemicthree of five outof conferences Holding onlinethree online. Because byaged ISPRAS, now inOrel University. State fourthexternal the systemOpening programming labman 30%comparedto andstudentsgrowth 2019. isabout Staff upto payments 200.000 canbe ship monthly rubles. in our frontlineindustrialresearch anddevelopment. Scholar The goal isinvolving talented studentsandpostgraduates ascholarship programLaunching inMSU, MIPTandHSE. cloudsolutions. such asAsperitas technologies existing ISPRAS onthe platform based willbe for retrieving, The storing, bigdata. andanalyzingmedical Theinstitute’stions). key taskiscreating adigitalplatform University institu Sechenov andother with centers (jointly Winning acallfor tender for scientific organizing world-class means. alsoquantumcommunication supporting for ensuringfederaloping new methods datacenters security 95.5 score of the projects getting 100. outof Itsgoalisdevel jointproject Institute wastopThe ISPRAS-Steklov ten inthe RAS). of Institute Steklov Mathematical with tender (jointly callfor andHigherEducation Science Winning aMinistryof Swemel, andothers. Code, RusBITech, Kaspersky Lab., Postgres Professional, suchasSecurity Russiancompanies within cle technologies afullscaledeploymentsecure development of Starting lifecy has started. HarmonyOS to systems -developed applying those practices foranalyzing best and OS verification andtesting Huawei. Theproject with on collaboration Expanding the andprogramengineering analysis. 2019. We have started developing for AItechnologies software tronics. Thejointlabfinancingincreased by 20%compared to Elec Samsung with collaboration decade-long Expanding the ------

7 Catalogue of Technologies ISP RAS 2020: Technologies Update

2020 has seen a number of important scientific results that served as a basis for new technologies or were the outcome of advancing existing ones. The results were recognized by the scientific community. Most notably, Trawl binary code analysis platform was included in the list of Academy of Sciences’ best results. Alexander Sergeev, President of RAS, said that Trawl plays an important role in ensuring cybersecurity.

Most important achievements of 2020 include:

— The Svace static analyzer now supports Kotlin and Go. Ana- lyzing large projects takes up to 35% less time. Taint analysis quality has been improved. Various C/C++ compilers support has been improved as well, in particular support for Visual C++. Analyzing programs built for RISC-V and Power- PC64 architectures is now supported, and the analyzer itself can work on ARM64 servers. — The Talisman social network analysis framework has been transformed to a full-scale data processing framework. It pro- vides a fast development of specialized multi-user analytical systems, which can merge and work uniformly with the data from private and Internet sources. — Dynamic analysis technologies have been improved and uni- fied in a single toolset called ISP Crusher. The toolset includes ISP Fuzzer, a fuzzing tool, and SyDr, an automatic test genera- tion tool for complex programs. — The Asperitas platform for data storage and complex re- source-intensive on demand calculations has been com- plemented with new tools. They include Michman, a PaaS services orchestrator for a cloud environment doing big data analysis and storage, machine learning, and running distribut- ed programs. The other new tool is Clouni, a tool for translat- ing TOSCA Simple Profile YAML patterns to IaaS deployment scenarios based on Ansible. — The newly developed Dedoc tool is a universal open system for extracting document structure and converting documents to a unified format. — The new Casr tool creates automatic reports for crashes hap- pened during program testing or deployment.

8 —  —  —  platforms suchas: main-specific form technologies integrated ISPRAS andother do These —  —  —  Dedoc tool for tool extractingDedoc structure. document Texterrathe platform for text extraction, semantic andthe exploratory SciNoon the tool, searchprocessing system, Data Talisman analysis platform the is formed of data dustrial devices. DigiTEF platform for creating of in complex digitalmodels aswell as the cloudsolutions vironment ISPRAS and other Distributed systems platform clouden Asperitas includes crash andISPObfuscator. analysis andreporting, set, AstraVer for Casrtool the verificationtoolset, deductive analyzers,BinSide static ISPCrusherdynamicanalysis tool development lifecycleSecurity platform Svace includes and 16cities. centers of guists in29universities andscientific endangered languages, of isnow moretion by lin widelyused Lingvodoc, asystem for documenta collaborative multi-user qualityandGhidra detection better disassemblerintegration. binaryanalysisinclude tool BinSidestatic the of Updates execution traces. onadynamictaintanalysis overdata leaks based detection new featureAn important for is support in-memorysensitive The Trawl analysis improved. platform binarycode hasbeen ------

9 Catalogue of Technologies

Technologies 11 Catalogue of Technologies

ENVIRONMENT ASPERITAS CLOUD FAMILY CLOUD SOLUTIONS ASPERITAS ORCHESTRATOR A UNIVERSAL MICHMAN, —  —  —  —  —  Manager, aswell asfor monitoring deployed applications have forwill soon asupport Kubernetes andSlurmWorkload clusters inreal actively developed and time. isbeing The tool deployed and projects status of ed andmonitor services the fullysetup machines. Userswith cancreate clusters inisolat cluster automatic deployment Itsupports data. amounts of running distributed programs onclusters andstoring large environment performing bigdata analysis, machinelearning, Michman isaPaaS orchestration services for tool acloud provides:Asperitas for deployment launching the process. tools machine prepared necessary inadvance hasallthe that deploying acloud environment from sources asavirtual local building large private clouds. Itimplementsanapproach of are that ubiquitous source for technologies open on modern using large available amount of resources. isbased Asperitas for Itisdesigned computations Dell. in a jointproject with and iscreated andCeph onOpenstack isbased Asperitas familycloud solutions Register asNo. (listed inthe 6066). Fanlight, aweb laboratories ISPRAS platform, isalsoapart of aPaaSman, orchestrator, andClouni,anIaaSorchestrator. RussianPrograms) Register Unified of in the aswell asMich a cloudenvironment (listed asNo. Asperitas alsocalled 5921 plex ondemand.Itincludes resource-intensive calculations isaplatformAsperitas for datastorage andperforming com detection etc.).detection mechanics, bigdataanalysis, program analysis for defect problem to (e.g. classes specific Adaptation continuum distributed filesystem;Ceph storageBlock storage andscalable object onthe isbased using Keystone, Neutron andNova (similarto AmazonEC2); Virtual clusters networks management andcomputational increase know-howbase anduses that security); solutions environment (the asmaller code High security isbuiltontop of research);ISP RAS standards open and software usageof aswelldue to as the andfullycontrolled inanisolatedenvironment installed be provided (the infrastructure option An onsite installation can AND - - - 13 Catalogue of Technologies Michman is able to deploy virtual clusters with PaaS services on demand, including: — A big data analysis cluster with arbitrary number of nodes having Apache Spark, Apache Hadoop, Apache Ignite and Jupyter Notebook fully set up and ready to work; — A for storing large data such as PostgreSQL, Apache Cassandra, CouchDB, ClickHouse or Redis. Some of the databases can be deployed in a distributed mode; — NextCloud storage and file exchange system.

Michman provides:

— A service for users and isolated groups management with REST API; — Storing the data regarding deployed clusters and services, their status, and access points; — Storing cluster templates ready for deployment; — Deploying complex distributed systems on demand witharbi- trary service combinations; — Managing service dependencies including per-version depend- encies; — Local deployment without Internet access; — Storing detailed data regarding available services, their ver- sions and parameters; — Easy service addition using REST API; — Integration with IaaS cloud virtualization systems.

The Asperitas distribution also includes Michman and Clouni (https://github.com/ispras/clouni), a tool for translating TOSCA Simple Profile YAML patterns to IaaS deployment scenarios based on Ansible.

FANLIGHT Fanlight is a web laboratories platform created as a result of ISP RAS participation in the University Cluster program and in the Open Cirrus international project (founded by Hewl- ett-Packard, Intel and Yahoo!). It is designed for deploying a SaaS infrastructure for a computational web laboratory using Docker Compose. Fanlight is built on a container technology and provides virtual workspaces in the Desktop as a Service model (DaaS). It is available at https://fanlight. ispras.ru and supports applications developed for a kernel based OS.

Fanlight provides:

— High performance cloud computing through the use of con- tainers: — Working comfortably with heavy CAD-CAE engineering applications that require 3D graphics hardware accelera- tion support for complex ; — Support for running MPI, OpenMP, CUDA applications via HPC clusters, multi-core processors and NVIDIA graphics accelerators. — PaaS-provided computing capability expansion via adding hardware resources (HPC / BigData clusters, storage systems, servers with GPUs); — Application area specific customization via integrating specialized computing application packages. The following 14 stories deployment Cloud solutions RAS Laboratory of Continuum Mechanics (https://unicfd.ru). Mechanics Laboratory Continuum RAS of raw ISP Uniondrocarbon State), the materials resourceof potential (developing hy for the atechnology increasing andusingefficiently Mathematics Keldysh Applied RRS-Baltika, of Physics, Institute OOO Research Experimental of Institute All-RussianScientific the of laboratory deployment, includingRussianFederal Nuclear Center jointprojectsThe Fanlightplatform for of was web inanumber used RussianFederation. of Science Ministryof the with jointly projects isperformed of anumber OS images. of Inaddition, testing regular andautomating buildand development OS components of Tizen OS lifecycle infrastructure support allows organizing that joint Huawei (large graphs andthe analysis usingbigdataprocessing) The following projects were alsoimplemented:ajointproject with (e.g. technologies ISPRAS other analyzingAndroid OS usingSvace). Talismanflows inthe analysis framework media social andsupports information analyzes clusteronAsperitas based The computing —  —  (IaaS) or in a local data processing center. dataprocessing (IaaS) orinalocal deployedCan be onaserver, farm, acomputing inacloud clientsoftware;tional addi without Allows client(includingmobiledevices) any thin — — systems were deployed: successfully

In aCFDarea: OpenFOAM, SALOME, Paraview etc.; In a Gas&Oil industry: tNavigator,In aGas&Oil industry: Roxar, Eclipse, Tem pest etc.pest - - - 15 Catalogue of Technologies ASTRAVER: A VERIFICATION TOOLSET

AstraVer Toolset is a deductive verification system for key software components. It allows developing and verifying security policy models as well as proving the correctness of software modules written in the C programming language. Astraver is essential for ensuring the required trust levels from ADV_SPM and ADV_FSP assurance families as defined in the ISO/IEC 15408 standard.

Features and AstraVer Toolset is a set of tools designed for industrial use. advantages It is based on many years of scientific research and com- bines two verification approaches: at the model level and at the code level. Parts of the AstraVer Toolset are similar to Microsoft VCC and Frama-C WP, but unlike those Astraver is specifically designed to support the key security compo- nents’ verification in the Linux kernel. AstraVer Toolset is free and open source: http://linuxtesting.org/astraver.

AstraVer — An integrated approach to verification, supporting the formal- provides: ization of high-level requirements and analyzing the C source code behavior; — Modeling and formalizing functional requirements, proving internal consistency and unreachability of insecure states; — Testing whether functional requirements are satisfied in an implementation, using their formal models to check the cor- rectness of the observable behavior and to evaluate the quality of testing and generated testcases; — Verification of critical components written in C (requirements’ formalization, correctness proof on all possible input values); — Support for real industrial C code (GCC compiler extensions, arithmetic operations with bitwise precision, address arith- metic including the container_of intrinsic, function pointers, casting); — Adhering to the protection profile requirements (ISO/IEC 15408), such as — formal security policy modeling; — formal verification of internal consistency of a security policy model; — formal proof that the target system cannot reach an inse- cure state; — a formal or a semi-formal functional specification devel- opment; — a formal/semi-formal proof of correspondence between the security policy model and the functional specification; — a formal/semi-formal proof of correspondence between different representations of target software, like function- 16 al specification, design and source code. workflow AstraVer stories deployment AtstraVer audience? target AstraVer Who is Automatic verification Manual development requirements Linux kernel Functional Functional security security API

etfcto laboratories. Certification — — — — ongoing. features work model fornew security cation the isconstantly were verified usingAstraVer successfully Toolset. Theverifi control source mechanisms code access andthe model policy security the 2Aprotection profile. the systems Both of requirements, security tion for whichare operating defined FSTEC the informa for with compliance certification the BITech As JSC). Astra aresult, this haspassed Linuxedition (RPAcontrol for mechanisms Astra Edition LinuxSpecial Rus AstraVer Toolset development was access inthe used of by ISO/IEC the 15408standard; railway,in aviation, andnuclear medical power industries; verification. components Csource code the Companies that need certification of their software as guided software their asguided of certification need that Companies developing systems,Companies critical includingsoftware customer for to toolset performAbility to aspecific adjustthe Deductive verification of verificationof components Deductive Security policy model policy Security Formal functional functional Formal specification Pre- and postconditions of LSM operations of Pre- andpostconditions Deductive verification of security models security verificationof Deductive Specification of library of functions Specification LSM-level requirements Formal design of LSM of design Formal Linux Security Linux Security Custom LSM Linux kernel LSM design Module - - - 17 Catalogue of Technologies BINSIDE: A BINARY CODE STATIC ANALYSIS TOOL

BinSide is a static program analysis tool for finding defects in binary code. It is useful when checking programs without source code, such as closed source 3rd party libraries, as well as assisting with required static information to dynamic analysis tools.

Features and BinSide is a binary code analysis platform based on the BinNa- advantages vi framework, which translates assembler code into the REIL representation. REIL allows analyzing binary code in a target processor and OS independent way. BinSide is integrated with the IDA Pro and Ghidra disassemblers that are widely used for reverse engineering.

BinSide — Easy extension: provides: — individual error detectors are written as plugins; — the REIL representation of 17 instructions without side effects is used (each assembly instruction is translated into a set of REIL instructions); — marking functions in IDA Pro as having tainted data sources. — Supports analyzing executables for -64, ARM and MIPS. — Detecting the following CWE types: CWE-121 (Stack-based Buffer Overflow), CWE-122 (Heap-based Buffer Overflow), CWE-134 (Use of Externally-Controlled Format String), CWE- 415 (Double Free), CWE-416 (Use After Free). — An analysis kernel with the following features: — value and pointer analysis; — tracking tainted data, static and dynamic memory models, as well as data flow and control flow analysis — finding errors on all execution paths (including those not covered by testing or dynamic analysis); — Converting result to the Svace analyzer format (in the pres- ence of debug info) for displaying errors in a common web interface and providing navigation in source code.

BinSide plugins include:

— A function semantics retrieval plugin. — Plugins based on code clone search technology: — The libraryIdentifier plugin that allows: — detecting usage of legacy libraries; — searching for copyright infringements.

18 workflow BinSide binary code assembler code IDA — — file to another. versions; A plugin for function names markuptransferA pluginfor names function from onebinary different between A pluginfor changes detecting software Bin Export PostgreSQL database REIL specification potential analysis Plugins defect 19 Catalogue of Technologies Casr: ­­ crash analysis and severity reporting tool

Casr creates automatic reports for crashes happened during program testing or deployment. The tool works by analyz- ing Linux coredump files. The resulting reports contain the crash’s severity and additional data that is helpful for pin- pointing the error cause.

Features and Casr solves the same problem as Apport, an open source advantages system, but in contrast with it, Casr estimates a crash’s sever- ity and provides a list of open files and network connections at a crash time.

Casr provides: — Detecting critical program faults that can lead to hijacking control flow. — Classifying crashes into one of 23 classes based on a pro- gram state at a crash time (function return address corruption, null pointer dereference etc.). Fatal errors are further grouped based on severity, such as exploitable, potentially exploitable, or denial of service errors. — Collecting a list of open files and network connections that could be the reason of a crash. — An extended crash report containing the fatal error’s severity and other data (OS and package versions, executed com- mand line, call stack, open files and network connections, register state etc.). — Reports for hard to reproduce crashes such as non-determin- istic errors, cases when the original execution environment is hard or impossible to reconstruct etc.). — Integration with monitoring tools (such as Zabbix) that allows system administrators and devops to receive on the fly notifi- cations about critical program crashes.

Who is Casr — Companies that need to receive the data regarding user-de- target ployed programs’ crashes to develop high reliability and audience? security software. — Companies that need to certify the developed software. — Certification laboratories.

Casr deployment Casr is deployed in a number of Russian companies and ven- stories dors as an add-on tool to ISP Crusher. Casr will be included in the Crusher distribution within the next two years.

20 Casr workflow requirements System System information Coredump Coredump extraction collection Call stack parsing Package information Process analysis collection Data structures being analyzed being ployed asadebpackage. Linux-based OS for x86, orARMv7. x86-64, de Casrcanbe mapped files mapped call stack siginfo_t prstatus prpsinfo Collecting opened opened Collecting files andnetworkfiles connections Severity estimation taint analysis Lightweight Instruction Instruction Exception Call stack analysis analysis analysis Report - 21 Catalogue of Technologies CONSTRUCTIVITY 4D: A TECHNOLOGY OF INDEXING, SEARCHING AND ANALYSIS OF LARGE SPATIAL-TEMPORAL DATA

Constructivity 4D is a technology for creating innovative software services that are capable of processing highly dynamic scenes and vast arrays of spatial and temporal data. It performs visual analysis of millions of objects with individu- al geometry and dynamic behavior. Constructivity is deployed within the Synchro system that is used for 4D modeling of extremely large construction sites.

Features and Constructivity 4D is a production level technology that puts advantages together original methods of spatio-temporal indexing, search and qualitative and quantitative data analysis. Devel- oped methods account for the specifics of objects’ geomet- ric representation, complex organization and the apriori known nature of their dynamic changes.

Constructivity — Support for a well-developed set of operations: 4D provides: — Temporal operations implement classical interval algebra introduced by Allen with respect to time stamps of dis- crete events and their intervals; — Metric operations allow determining the individual prop- erties of geometric objects and the characteristics of their mutual arrangement. Diameter, area, volume, center of mass, planar projections, and distances between ob- jects can be calculated for solid geometric objects; — Topological operations are intended to classify the rela- tive location of objects and to establish the facts of their coincidence, intersection, coverage, touch, overlap or collision. In contrast with known topological models such as DE-9IM, RCC-8, RCC-3D, these operations allow constructive implementation and are applicable for the analysis of complex objects; — Orientational operations generalize known Frank’s and Freksa’s relative orientation calculi, cardinal direction cal- culi (CDC), oriented point relation algebra (OPRA) and are

22 stories 4D deployment Constructivity audience? 4D target Constructivity Who is —  —  —  —  —  —  in more than 300 companies in36countries. in more 300companies than and infrastructure areas, aswell asothers. Synchro isused construction industrialprojects inthe large-scale ment of for planningandmanage is designed visual4D-modeling, Synchro software system (https://www.synchroltd.com) that deployed the successfully within hasbeen The technology projecttics, managementandscheduling. andmanufacturing design logis robotics, tion, automation, visualiza geoinformatics,scientific graphics andanimation, in vastly different fields, includingbutnotlimitedto: computer for isused systems creatingThe technology application applications. innew software development applications both andinlegacy Various for options extending library used the itcanbe sothat planning. usageonpath its concerted and information from 3Dscenes representation geometric of onextracting ment isbased metricandtopological spatial, for inglobaldynamic environ navigation An originalmethod ries. spatial-temporalspecifying dataandexecuting typicalque extensible classes, interfaces setof for andrelated methods libraryAn object-oriented including implementedinC++that coherence methods. hierarchies of volumes, bounding methods of temporal sition, decompo usingspatial methods localization collision nation, for precise determi collision methods combines that scenes strategyA hybrid computational for determining in collisions trees. clusterobject trees, occupation space trees, volume bounding trees,trees, decomposition spatial A spatial-temporal indexing system includingbinaryevent ronment are resolved. effectively collisions, andconflict-free routing inaglobaldynamicenvi finding nearest neighbors, anddynamic determining static intime, retrievingpoint inagivenregion, objects spatial in particular, atagiven for ascene queries reconstructing queryexecution andtypicalproblemsEfficient solving, boundaries. extended with objects applicable for analysis of the ------23 Catalogue of Technologies ISP CRUSHER:­­ A DYNAMIC ANALYSIS TOOLSET

ISP Crusher is a toolset that combines various dynamic analysis approaches. It includes ISP Fuzzer, a fuzzing tool, and SyDr, an automatic test generation tool for complex pro- grams. Two other ISP RAS analyzers, BinSide and Casr, will be included in Crusher within the next two years. Crusher allows organizing a development process that is fully compliant with GOST R 56939-2016 and other regulatory requirements of FSTEC of Russia.

ISP FUZZER, ISP Fuzzer is a fuzzing tool that is essential for any software A TESTING TOOL development phase, be it coding, testing, or deployment. The fuzzer finds errors and backdoors in programs either with or without source code. It solves the same problems as its global competitors (Synopsys Codenomicon, beSTORM, Peach Fuzzer), but it is more convenient for Russian compa- nies in the import substitution context.

ISP Fuzzer — Fuzzing various input data sources (files, command line provides: arguments, standard input stream, environment variable arguments, network sockets); — Adding custom mutational transformations (for new input data generation and increasing fuzzing efficiency); — Input data pre- and post-processing plugins for perform- ing data transformations before feeding the data to the software being analyzed; — Multicore and distributed parallel analysis support; — Custom network plugins support allowing to interact with a client or a server and to send mutated data; — Integration with other ISP RAS security development lifecycle tools, such as: — using Anxiety or SyDr dynamic symbolic execution analyzers for improving fuzzing efficiency; — generating input data that triggers errors found by the BinSide static analysis tool; — using Svace analyzer GUI to show a function call trace that resulted in a crash; — using the ANTLR parser generator to create an input data corpora. — Integration with the IDA PRO disassembler (exporting coverage data to the Lighthouse plugin for displaying covered basic blocks and showing the percentage of the coverage achieved); — Client/server software analysis operating via stateless or stateful protocols; — Running dynamic analyzers on generated input (Valgrind, DrMemory, QASan); — Easy extension via adding new analysis algorithms to the existing infrastructure as well as fast adaptation to new 24 fuzzing problems; SyDr provides: EXECUTION TOOL SYMBOLIC SYDR, ADYNAMIC requirements System audience? Crusher target Who isISP stories deployment ISP Crusher —  —  —  —  —  —  —  —  —  —  —  —  —  —  Security Code, Swemel andothers. Security labs,fication includingRusBITech, Postgres Professional, andcerti in more20companies ISP Crusherisused than andCOMobjects.dows services (controllers, devices as well IoTdevices) embedded asWin Linux andWindows OS family Crushercanalsofuzz support. branchestional inverted. newly found target the execution hasthe condi whether path generated of correctness ensures inputdataby the checking in ISPRAS. source Incontrast tools, similaropen SyDr with earlierused inAvalanche andAnxietyanalyzers developed execution dynamicsymbolic approach the advances further hard to approaches. follow mutation Thetool viagenetic allows afuzzertothat explore are new execution that paths program’s the SyDrconstructs testing. model mathematical grams findserrors that andincreases coverage code during test generationSyDr isanautomatic for tool complex pro Companies auditing or certifying software. orcertifying auditing Companies developingCompanies highlyreliable andsecure software. upSMTsolver work forand speeds generated queries. predicate. Thisfeature problemundertainting solves the of branch inverted) conditional being encing the frompath the slicing.SyDrremovesFormula excessive formulae (notinflu jumpsareand computed detected; Inverting indirect branches (inswitch statements). Table jumps programs; execution multithreaded Symbolic of dereference, buffer overflow); input datafor errors the found (for divisionby zero, nullpointer Safety are predicates to that used finderrors andto generate arguments); environment variables, standard inputstream, line command various inputdatasources (files, networkSupporting sockets, oninputdata. depend that instructions in reaching agiven a trace program of orcollecting point Automating reverse anexpert suchasassisting engineering constants; with oncomparisons depending branches the ISP Fuzzerintegration behind to reach code the ed; Inverting inparallel oninputdata. depend isalsosupport that branchesGiven inverting allconditional anexecution path, QemuSystemArm. RIO, QemuUserMode, GCCorLLVM instrumentation, static various suchasDynamo tools instrumentation Supporting RadamsaUsing the fuzzerfor new inputdatageneration; severity forcrashesEstimating the found; work;effective for fuzzerprocesses more the inputdatabetween Distributing ------25 Catalogue of Technologies ISP Crusher workflow

ИСП FUZZER QemuSystemArm

PeachPit Linux LL VM static instrumentation Mutation plugins Additional analysis GCC static module SyDr instrumentation Classifier Valgrind Input data Filtering Unique report Anxiety corpus QemuUserMode Crashes DrMemory report PostProcessing Linux/ QASan DymamoRIO Crash Windows report Radamsa Svace report Transport Instrumentation Dictionaries New input Instrumentation module Data Analyzer Lighthouse coverage Grammar input Severity generator New input with new Intut data Intut planner Intut queue estimation coverage

SYDR

Symbolic Executor Input Events (Symex) Spec. ReadSymbolicInput Symbolic Input Manager WriteSymbolicInput Input Symbolic Concrete Executor Instruction (Conex) Selector

Input Binary Symbolic Engine Derector

Indirect Control Instruction Flow Transfer DBI Resolver

New Inputs Exit Path predicate Slicing

New input Solver generator

26 advantages Features and RETRIEVAL SYSTEM STRUCTURE DEDOC:­­ provides: Docreader provides: Dedoc

— — — — — — — — — — tracted automatically. inputdatatype. Metadataandtextof formatting isalsoex structure extractionDocument regardless isfullyautomatic JPG etc.), archives (ZIP, etc.), RAR PDFandHTMLformats. Docreader pluginspackage for (PNG, images working with extended the canbe viapluginsandcontains Dedoc JSON). ODT,tured dataformats (DOC/DOCX, XLS/XLSX, CSV, TXT, andworks semi-struc with isimplemented inPython Dedoc analysis systems asaseparate module. integrated canbe contents andstructure indocument Dedoc represented asatree storing any headings level. andlistsof ture, Thedocument’s andmetadata. itstables contentsare to format. aunified Itextracts adocument’s struc logical universal isanopen system for Dedoc converting documents A DOCUMENT extracted from images. document features of archical classification structure onthe based andextracting orientation adocument itshier detecting imagepreprocessing methods. with together Google, contouranalysis, tableorientation. help of detecting plex having tables explicit multipage the borders with layer. domains. allowingreports, papers) flexible scientific tuningfor new documents, technical legal work, origin (statements of format. font type, size, styleetc.). having dfiferentdocuments formats. formats andto anoutputdataformat. aneasy changeof Utilizing modern machinelearning approaches modern for Utilizing Using Tesseract, enginefrom anactively developed OCR aphysical structureRecognizing text andacell for com atext orwithout with Working either PDFdocuments with various Working of images document scanned with Extracting tabledata from anXML-style DOC/DOCX lists. document for mistyped rules Adding correcting Extracting various text formattingfeatures (indentation, for extractingSupport structure nested document outof Extensibility new dueto document a flexible of addition - - - - - 27 Catalogue of Technologies Who is Dedoc — Developers of document contents analysis and man- target agement systems. audience? — Developers of intellectual text analysis algorithms. — Developers of automatic document processing systems.

SUPPORTED Russian and English. LANGUAGES

Dedoc Document structure workflow extractor Manager Table extractor

Docreader

API html-document

Format converter pdf-document with text layer Document attachment analysis pdf-document without text layer, images

Dedoc Converting to output format docx-document

excel-document

Recognized document txt-document structure

csv-document

28 advantages Features and platform a digital twin DigiTEF: blocks: of two main consists DigiTEF means: DigiTEF

— — 2. — — — 1. — — — — ject developersject hasformed around DigiTEFplatform. the engineers, of researchers,The community andindustrialpro sameaccuracy. the with costs tional CCM+ showed lower) similar(andinsomecases computa accuracy evaluations ANSYS compared with FluentandStar worldwide.its competitors DigiTEFcore and performance The platform delivers samelevel as userexperience the of ­Programs (No. 5377). es. ­ industrialdevic of digitalmodels highly sophisticated Itistailored forand acoustics. creating andworking with problems gasdynamics, aerodynamics, of hydrodynamics, developed atISPRAS. DigiTEF solves various application source tools, andlibrarieser open aswell asuniquemodules DigiTEF isasoftware platform onOpenFOAM based andoth (density, velocity, temperature, andothers); quantities ing procedure to maingas-dynamic determine the averag allowing spatio-temporal (QGD) equations, to usethe transformations);Hilbert (FFT, methods dimension usingdataprocessing POD, DMD, reduced to analyzeresults of designed andbuild models softwareCodeAster systems into DigiTEF. aswell cases asintegratingcalculation Salome, ParaView, and following the components: of consists athttps://github.com/unicfdlab. obtained can be OpenDTEF third-partylibrariesset of and inC++.Itiscompletelyopen basicalgorithms, procedures,the andfunctions,aswell asa requirements. specific the allow integrated research technicalobjects; of algorithms); ed Compressible flows simulation based on quasi-gas dynamics dynamics onquasi-gas based Compressible flows simulation Data analysis for visualizingandretrieving It is information. developed atISPRAS: Modules Thisallows automating parameterization on Python. based setupfor onswak4Foam; settings advanced based cases for compressibletools flows; modeling OpenDTE, platform the coreonOpenFOAM. based Itcontains possibility of developing additional components according to developing components of additional possibility integration andmodel that for tools automation computation development asinOpenFOAM+; pace (allows to source control code andtoopen adaptimplement DigiTEF is included in the Unified Register of Russian Register Unified of inthe DigiTEF isincluded

------29 Catalogue of Technologies — Incompressible flows simulation based on QHD equations. The module is applicable in oceanology, convection, and sub- sonic flows problems; — Incompressible and compressible flows simulation based on the Pimple and Kurganov-Tadmore hybrid algorithm; — Subsonic turbulent flows simulation using the hybrid URANS / LES approach and low dissipative numerical schemes; — Acoustic analysis. The module implements the Curle and Focs Williams-Hawkings analogies; — Simulation of ice dynamics.

Who are DigiTEF is designed for use in the facility of resource-intensive DigiTEF users? industries. Using digital twin models allows increasing engi- neering efficiency as well as reducing costs and complexity of the industrial projects implementation.

Deployment DigiTEF is used in several projects in the fields of wind energy, stories aerospace, aviation, metallurgy, as well as in the oil and gas industry. DigiTEF open-source modules are successfully used in Institut Pprime (France), Korea Atomic Energy Research Insti- tute (Korea), Universität der Bundeswehr München (Germany), Northwestern Polytechnical University (China), Embry-Riddle University (USA), California Institute of Technology (USA), etc.

System Linux OS. Other operating systems that support the Oracle requirements VirtualBox virtual machine may also be used (on 10 via the Bash shell). Moreover, the performance loss due to virtualization does not exceed 5%.

Required RAM: 16 Gb or higher.

DigiTEF supports parallel computing, which significantly speeds up its work. Also it supports high performance com- puting systems (supercomputers and clusters) to accelerate calculations. Using up to 1536 computational cores was tested.

Workflow

Setting Setting of a unique Input data of a standard problem task

SALOME … PyFoam BEM++ QGD module Unique PANS/ module LES AMReX HCS module Swak- Digital model 4Foam Code_Aster CAA module Specialized digital model

As a platform for As a modeling integration tool Digital Test Facility

30 advantages Features and FRAMEWORK VERIFICATION KLEVER: audience? target Who isKlever provides: Klever etfcto laboratories. Certification — — — — — — — — source project (https://forge.ispras.ru/projects/klever). Ccode. Klever the isanopen- of lines ormillionsof thousands approach for software hundreds verificationof systems of with fromed source code. Theframework implementsamodular non-interactivefield of for extract checking models model Klever research scientific isaresult of and development inthe various of andsafety security tion requirements. programming language. Klever performs automated verifica large of softwaresource systems C code developed inthe Klever isaframework extracted for models checking from the cal software. programs. environments target specific of including modeling program requirements of violations specific detecting setsforcustomer Developmentspecification needs. of verification results.of andforrunning verification processes expert analysis arguments required to reproduce them. expressions of es andvalues variables of andfunction to framework faulttion the locations provides sequenc programs. execution andsymbolic for checking as model large mostrigorousing the program such analysis methods, gram APIs). specific ing memorysafety for pro andrules correct usageof stated assumptions). itly programsand proving of correctness the underexplic requirements specified of violations ing allpossible Companies developingCompanies andsecurity-criti safety-critical software the verificationframework of Adaptation to A convenient web-interface multi-user for and setting Comprehensive found representation of faults. Inaddi Scalability. program Modular verification allows apply anextendableChecking requirements suite of (check Thorough industrialsoftware sound analysis of (reveal A SOFTWARE ------31 Catalogue of Technologies Klever Klever was developed by the Linux Verification Center (http:// deployment linuxtesting.org/) that was created with support from the Linux stories Foundation and hosted by ISP RAS. Currently the framework is mostly used for verification of various operating systems.

To showcase Klever features, it was used for verification of Linux kernel device drivers. The number of faults found by Klever and acknowledged by Linux kernel developers exceeds 350. These faults include buffer overrun errors, null pointer dereference, uninitialized memory use, double or incorrect memory deallocation, race conditions and deadlocks, leaks of specific Linux kernel resources, incorrect function calls in a certain context, incorrect initialization of specific Linux kernel data structures.

System Ubuntu 18.04, at least 4 CPU cores, 16 GB of memory, 100 GB Requirements of disk space.

Workflow

Framework adaptation to the target software system

Configuration and start of verification processes

Automatic verification

Expert analysis of verification results

32 LANGUAGES ENDANGERED DOCUMENTING LABORATORY FOR A VIRTUAL LINGVODOC: provides: Lingvodoc Advantages Features and — — — — — — — — — — https://github.com/ispras/lingvodoc-react). an innovative research (https://github.com/ispras/lingvodoc, source cross-platform isanopen Lingvodoc system on based ­lingvodoc.ispras.ru. der activedevelopment found 2012andcanbe on since andTomskof Sciences University. State is un Lingvodoc RussianAcademy the of Linguistics of Institute the with received soundandtext the Itisajointprojectwith data. work andperformingmulti-layered scientific dictionaries endangered languages, of for creatingdocumentation isasystemLingvodoc intended for collaborative multi-user supported; structures isalso dictionary multi-layer Importing dictionaries. lexical with entry layerdictionaries andparadigms layer or analysis; andphonetic abilityto etymological ject): carryoutautomatic isoglosses; similarTypeCraft to the posed project); formants followed datavisualization; with WAV, MP3and FLAC formats), vowel aswell asconstructing dictionaries; between external connections aswell as lexical dictionaries between within entries tions erlands); Psycholinguisticsdeveloped by of MaxPlanckInstitute (Neth ELANsystem integration the onthe with based taneously feature); this doesn’t project support that Starling Creating dictionaries of any structure, of Creating dictionaries suchastypicaltwo-layer levelHigh automation similar Kielipankkipro (compared to the Conflict-free bilateral delayed synchronization; of Ability to search construction automatic dataonamapwith parameters multiple Advanced search supports that (asop Recording, playing markup(in andstoring sounds with connec andbidirectional unidirectional Creating and editing Working simul audio-textual anddictionaries corpuses with useractions; Saving fullhistory of similar to the Collaborative (asopposed work ondictionaries ------33 Catalogue of Technologies — Using either ISP RAS cloud infrastructure resources or locally deployed resources with data isolation; — Desktop and web-based versions; — Open registration (confirmation required); — Fast development for extending the system features as well ­as easy adaptation to another scientific field.

Who is Lingvodoc is designed primarily for linguists performing a re- Lingvodoc search in the area of documenting the endangered languag- target es in Russia. However, it is possible to adapt the technology audience? for other purposes.

Lingvodoc Lingvodoc is currently used by philologists in 29 universities deployment and scientific centers of 16 cities, including Tomsk State Uni- stories versity, Institute of Philology (Siberian Branch of RAS), Institute of history, language and literature (Ufa scientific center of RAS), Udmurt Federal Research Center UB RAS, North-Eastern Federal University, Ugra State University, Institute of Linguis- tics, Literature and History (Karelian Research Centre of RAS), Murmansk Arctic State University.

Lingvodoc workflow

GraphQL HTTP Language Protocol Lingvodoc Frontend Browser expert web-interface

React

Apollo

JavaScript Redux JavaScript Python Semantic-Ui

Ruby Wavesurfer

C ++ Leaflet

GraphQL HTTP Lua Protocol Lingvodoc backend C# Pyramid Apple Swift Programmer Celery Java Dogpile Scala Python 3.5 + Graphene Any language with HTTP support SQLAlchemy

Using bash C extensions and curl

Using browser add-ons (such as Altair) 34 advantages Features and SOFTWARE SYSTEMS HIGHLY RELIABLE FOR DESIGNING MASIW: provides: MASIW

oe synthesis: Model — analysis: Model — — verification, static,verification, anddynamicanalysis. areas opment, MASIWcurrently ismore inthe of functional OSATE the presence of the Despite devel startof atthe tool “GosNIIAS”. with and defects. MASIWisdeveloped jointly faulting the tolerance errors analysis. riskof the Thisreduces first prototype,product before makingthe aswell as perform It allows performing the apreliminary of qualityassessment complex hardware/software of verification process systems. development for the and technology optimizing MASIW isthe areas.easily adapted for application other integrated avionics (IMA)approach. MASIWcanbe modular hardware/software are systems that developed usingthe areas.critical for Itisdesigned engineerscreating airborne and software systems for avionics, medicine, safety andother forMASIW isatoolset developing highlyreliable hardware — — — — — — — — — — — language: modeling AADL Creation, editing and management of models based on the onthe based models andmanagementof editing Creation, SUPPORT processor schedule generation schedule processor (inparticular, for hardware+software of simulation user system with model failures analysis of based architecture-model andtheir generation fault andanalysis of trees (FTA) to determine transmission characteristics analysis for AFDX the net developed system the for with verification of compliance hardware+software system structure analysis: hardware reuse. models for third-party AADL support the forsupport team abilityto development track the with and text usingthe and diagram models of creation andediting ARINC-653 compatible real-timeARINC-653 compatible operating systems). regarding restrictions additional reliability andsecurity; ules, hardware takinginto account resource and limitations QEMU. RTOS with on-board with partitions of tion co-emulated execu reports generation includingsoftware-in-the-loop tive tables; descrip includinggeneration special consequences, of high-level of probabilities fault events; etc.;works: latencies, message queuedepth, port requirements;the resources sufficiency, interfaces consistency, etc.; amodel; individualelementsof modify editors; distribution of software applications by computational mod software by of applications computational distribution ------35 Catalogue of Technologies — Configuration data generation: — development of specialized configuration data tools based on the provided software interface (API); — configuration data generation for the VxWorks653 RTOS and for the AFDX network equipment. — The ability to extend the toolset by creating own modules.

MASIW Workflow

Model The software-hardware refinement Results of model analysis system in the form (AFDX Network Analyzers, of AADL-models IMA System Analyzers, (repositories, git, AADL- PyCL Checker etc) model libraries etc)

MASIW

Data from hardware or software vendors in the Configuration data for form of configuration software-hardware files, AADL-models and system (JetOS, the requirements for VxWorks653 etc) Requirements them refinement

36 advantages Features and Generator A TestProgram MICROTESK: provides: MicroTESK — — — at http://www.microtesk.org. pras.ru/projects/microtesk. isalsopresented Thetechnology It isfree website: for ISPRAS https://forge.is download onthe license. Apache2.0Also, open-source itisdistributed underthe increased with usabilityandperformance.outperforms them (e.g.,lar to globalcompetitors Genesys Pro andRAVEN) but templates). MicroTESK delivers value issimi users to that the generation framework (buildingtest programs ontest based andthe microprocessors onformal based specifications) framework modeling the includes of (buildingmodels that solution production MicroTESK is a state-of-the-art onlinetest programsupports generation. tures ranging from CISC/DSP to andVLIW. RISC MicroTESK tors automatically. MicroTESK avariety architec supports of tures, MicroTESK test program allows constructing genera microprocessor architec of onformalBased specifications test programs microprocessors. for verification of functional MicroTESK isanindustry-targeted framework for generating — — — — — templates: — — — microprocessor underverification: the Wide range microprocessor architectures: supported of Test programs generation test onobject-oriented based knowledge about asasource of Using formal specification

MicroTESK-based test program generators have been many architecture of featuressupport specific (RISC, generation framework scalability (abilityto develop com allows using different instruc generation of techniques tem test Ruby the templates inthe language (sothat an option to makean option atransition to formal verification andto inthe memorysubsystem specifications additional nMLlanguage(registers, inthe architecture specification CISC, VLIW, DSP); plex test templates atlow dueto cost reuse). generation, etc.); generation, combinatorial generation, constrained-based (random andtest datasimultaneously sequences tion plates are humanreadable andeasy to support); under development (disassembler, emulator, etc.). generation toolchain an automatic for amicroprocessor L2), address translation logic, read/write operations logic); mmuSL language(memorybuffer(TLB, properties L1,and representation); instruction memory, logic, addressing modes, instruction text/binary

------37 Catalogue of Technologies developed for such architectures as RISC-V, ARM, MIPS, PowerPC; — multicore architectures are supported. — Fast adjustment to a new microprocessor architecture with minimal costs and automatic extraction of information about test situations (due to formal specifications); — Convenient language for developing test templates that allows describing complex verification scenarios quickly; — Support for online test program generation: post-silicon verification of the target microprocessor. The executable generator, included into MicroTESK, performs the online generation. It uses text and binary instruction representa- tions extracted from the formal specifications. Correctness checking is based on the multiple executions of the sets of test sequences: the original ones together with the ones with functionally equivalent substitutions.

System Windows or GNU/Linux-based OS, Java 8. requirements

MicroTESK MicroTESK is developed since 2007. It was used in various deployment Russian and international projects on developing modern stories industrial microprocessors, including production projects on verifying ARMv8, MIPS64, and RISC-V microprocessors.

Workflow MicroTESK: Test Program Generator

Model Simulator Translator Constraints

Verification engineer Core Generator Test templates Extensions

Fig. 1. Traditional (­pre-silicon) test program Test programs generation flow

MicroTESK’s ISA in nML Test Templates Meta Generator

Runnable Sets of Test Cases Generator

Closed Binary Exerciser Code

Fig. 2. Post-silicon test Comparator program generation and 38 execution flow provides: ISP Obfuscator advantages Features and ISP OBFUSCATOR — — — — — — — — — — — — ods andallows awholeOS compiling distributive.ods vulnerabilities usingvarious of diversification meth tion code protectsObfuscator asoftware system from massexploita made. will ­ rest software hascertain the installed, that devices the of backdoors. Incaseahacker attackingone iscapableof vulnerabilities fromexploitation resulting of errors or to prevent technologies isasetof Obfuscator mass (including the ASLRsystem mechanism). (including the linker.to the duringbuildtime, compiler asopposed the within performed and more flexibly required customized, asthe operations are usecase); things for Internet of important the — Browser); developed for Tor Selfrando(compared to technology the the software kernel, antivirus the and avoiding the with conflicts code); Pagerando randomize that technologies onlylarge of blocks — traditional the methods. of SPEC CPU2006test suite, lower that whichisnoticeably than average showscompiler the 2%onthe slowdown about of ret-to-libc, GCC etc.). the within Theimplemented CFIsupport reuse counteractscessfully attacks code (ROP, mostof JOP, wholeOS sourcethe code. efforts for integration itsbuildsystem are with required). maximumis8 times. the times, degradation is1.2 speed (when protecting againstreverse Theminimum engineering). Conflict-free combination with other software other protection tools Conflict-free with combination performs alsoControl Flow Integrity(CFI). an extended diversifying setof transformations applied canbe degradationperformance iscloseto zero; notincrease sizedoes (which isparticularly binarycode the except whole OS code throughout the Shuffling functions granularity to function ASLRand Shuffling with (asopposed Two diversification approaches: The originalcontrol flow integrity technique(CFI),whichsuc GCCcompiler, onthe Based building whichallows correctly program to (nochanges the Full automation or source code level andperformance obfuscation of balance the Fine-tuning remain protected by changes to the code that the tool tool remain the protected that by code to changes the Static code diversification. During diversification. each separate code Static compila diversificationDynamic atprogram code startup. Itisused file iscreated. Thisapproach following hasthe advantages: key, specified onthe depending tion, new executable the advantages over similarproducts: the 1.5%. about provides of Theobfuscator followingtion the aslightincreasewith degrada insizeandaperformance allowscode shufflingupto 98%of procedure). Thismethod (foron alldevices example, certification the of because deployed samebinarycode the customer needs when the - - - - - 39 Catalogue of Technologies Who is — Developers of specialized operating systems; Obfuscator — Application software developers. target audience?

Obfuscator ISP Obfuscator is deployed in the Zirkon OS, which is used by deployment the Ministry of Foreign Affairs and the Border Guard Service stories of the Federal Security Service of Russia.

System Obfuscator is a universal product that can be adapted to requirements many system requirements. The production version is cur- rently running on a Linux-based OS (version 2.6 and higher) with the Intel x86 / x86-64 architecture support.

Obfuscator workflow

Standard compilation

Executable Source code GCC compiler and linker code

Hacked

Exploit Errors Errors

Errors Static diversification Executable Seed 1 code 1 Hacked

Static diversifying Executable Source code GCC Exploit code 2 and Linker Seed 2 Not Hacked

Executable Seed 3 code 3 Not Hacked Errors

Dynamic diversification Run of executable code 1 Not Hacked

Executable Modified GCC compiler and code diversifying Run of executable Exploit modified linker Data for dynamic code 2 diversification loader

Run of executable code 3

Source code 40 Errors advantages Features and ANALYZER A NETWORK TRAFFIC PROTOSPHERE: provides: Protoshpere — — — — — — el that enables rapid enables el that expansion analysis capabilities. of auniversal Analyzer)with Message datarepresentation mod keythe features (e.g. similartools of Microsoft Wireshark, research area network inthe traffic of analysis. Itcombines Protosphere isaninnovative innovative system onthe based internal representation. flexibility its dueto orclosed) the of for open new protocols (either actualtraffic. andthe specification Itallows quickly to addsupport tems. Protosphere aprotocol between detects inconsistencies serve intrusionandinformation leak protectionsys asapartof Protosphere (DPI).Itcan packet deep isasystem of inspection — — — analysispresent results; the — — — — — — — — stages): are that synchronized between component has avisualization — — — — — Universal datarepresentationto accelerate model customization: mostconvenientAdvanced GUIprovides the way of choice to analysis modes; onlineandoffline for both Support for support new protocols: Easy network for trace of allstages Support analysis (each stage Advanced system core: configuring the analysis results the format.configuring extract datainadesired format; forsupport new protocols; developed on real-time being module trafficdebugging the parsing errors localization; to parsing resultsaccess viaAPI; arbitrary OSI-layer dataextraction and analysis (L7+). discrepancies aprotocol implemen between of detection interactive parsednetwork packets the of visualization timeline inthe connections selected the view of detailed network interaction inthe localization network connections forsupport network flows causality. for arbitrarysupport tunnelsof configuration compressed/encrypted dataanalysis; reordered corrupted, of processing orduplicated packets; universal whenparsing datarepresentation used model and network traces. actualtraffic log; andthe diagnostic tation inthe streamin the tree; diagram; graph network flow andthe tree; packet asymmetrictraffic;handling of of loss;processing network traffic; - - - 41 Catalogue of Technologies — Adjustment to network bandwidth and available computa- tional resources to find a balance between accuracy of the analysis and the resources consumed.

Who is — Companies that are testing network protocol implementa- Protosphere tions including those in embedded OS and network hardware. target — Developers of network security tools, such as firewalls and audience? IDS/IPS. — Manufacturers of network hardware that must be certified. — Companies requiring real-time control and monitoring of network channels.

SUPPORTED Architecture: Intel x86-64. PLATFORMS AND Platforms: Windows, Linux kernel based OS. ARCHITECTURES

Protosphere Workflow Data extraction modules

statistics on network file extractor protocols

LAN monitoring DLP API

Third-party tools

Network traffic analysis — Online modules — Online core Protosphere source code Parse errors support for new Modules: protocols — recognizers — parsers Network traces

Core — module management — access to parsing results Network traces analysis — parser failure diagnostics —GUI Interactive offline — offline modules analysis — offline core

42 advantages Features and BASED ON ANALYSIS PLATFORM ISP RASSOFTWARE provides: Platform Foundation ISP RASQEMU — — github.com/ispras/qdt, https://github.com/ispras/i3s. pras/swat. are Thedeveloped QEMUtools available athttps:// available GitHubpage:https://github.com/is ISPRAS onthe reverse is QEMU with support debuggingandintrospection makes malware analysis more secure. analysis tools, includingCoveritycode andSvace, sothis kernel. isregularly checked TheQEMUsource by code static bugging low-level andanOS software suchasabootloader allows de that mode fullsystemQEMU supports emulation IDAProtocol clientscanbe (the Pro, GDB, IDE,etc). Eclipse PC, etc.), aswell debuggingviaGDBRemote asguest Serial architectures (i386, AMD64, ARMandThumb, MIPS, Power set more 10instruction of than emulation QEMU supports for debugginglow-level software. features,introspection aswell mode asfullsystem emulation platform development. reverse Itsupports debuggingand framework for organizingplatform isessential multi andcross sourcesystems QEMUemulator. open isbuiltontop of This Platform for Foundation creating programISP RAS analysis — — — installing monitors: — — — — VM introspection solution without any guest modifications or any modifications guest without solution VM introspection A record andreplay mechanism: WinDbg server support in QEMU that allows showing server inQEMUthat WinDbg support allLinux-basedSupports aswell virtualmachineimages Recovering following the OS-level information: system Low overhead performance by caused recording. This The minimumrequired informationisrecorded. Thisal reverse debuggingisimplemented GDB-compatible eventsOne canrecord once, non-deterministic then guest softwareguest Windows information interms of kernel software for images as embedded various devices; ules; mod andloaded files open runningprocesses, listof of inshared libraries, to functions calls, named accesses list external environment interaction. software requires real analysis enables of that the time errors; lows one to record longerfor debuggingrarely occurring are totion used reconstruct previous states; abilityto replay deterministically shots andthe VMexecu record onthe andreplay VMsnap based mechanism. deadlocks) easier; (raceing bugsinmulti-threaded applications conditions, same VMexecution isreplayed every time. Itmakes find replay recording the deterministically, many The times. QEMU ------43 Catalogue of Technologies abstractions. There is no need to enable kernel debug- ging mode in the guest OS. — Speeding up QEMU development: — Faster development of dynamic analysis tools that can analyze binary code for specific hardware; — Automated support for new processor architectures using a machine instructions decoder generator and a C-like language for describing machine instructions semantics; — An automatic tool for preliminary testing of a virtual machine. The tool only requires GNU Binutils and the C compiler; — A tool for automating QEMU virtual devices development; — VM generation tool in the form of a QEMU module source code. The tool can create VMs from both existing devices and new devices. The tool utilizes a GUI for a virtual machine schema and a Python description for the new machine; — A Python API for an automated debugging via GDB Re- mote Serial Protocol. It is used to debug QEMU, the guest OS, or both at the same time — Convenience and user experience: — Easy QEMU extension due to open source code and ISPRAS toolkit for speeding up development; — Binary code analysis without any guest OS modifications; — VM introspection mechanism that can be extended using plugins; — A convenient API for developing own introspection plugins; — Can be easily adapted for specific use cases; — Support for latest QEMU versions that have support for newest peripherals and CPUs.

Who is ISP RAS — Bootloader, driver, OS and other system software developers; Foundation — DevOps teams for reproducing of software bugs, cross-­ Platform target platform development, and scalable cloud testing; audience? — Programmers analyzing potential malware. — Software certification engineers.

Supported guest Emulation of the following ISAs: i386, x86-64, ARM, MIPS, platforms PowerPC, and others. Guest systems supported by the introspection mechanism: Windows XP (x86), Windows 10 (x86-64), Linux 2.x-4.x (x86, x86-64, ARM, AArch64).

ISP RAS QEMU The QEMU community has accepted ISP RAS patches for the deployment record and replay mechanism and added them to the open stories source QEMU version 3.1.

44 Workflow software Target QDT l3S + Events High-level information Instructions CPU devices Extensions Plugins QEMU and replay Processes Record +

IDA Pro, WinDbg GDB [+Eclipse] Debuggers and IDEs 45 Catalogue of Technologies RETRASCOPE: STATIC ANALYSIS OF HDL DESCRIPTIONS

Retrascope is a functional verification toolkit for digital hardware designs. Retrascope provides automated engines for code analysis, formal model extraction and functional test generation. The toolkit accepts digital hardware module descriptions as inputs, written on the synthesizable subset of Verilog and VHDL languages, as well as their behavioral specifications.

Features and Retrascope is an open source toolkit for functional verifica- advantages tion of digital hardware designs. The toolkit provides several methods for formal model extraction and analysis and functional test generation. A component-based architecture of Retrascope allows user to develop hybrid techniques of formal model analysis by freely combining various analysis methods. The toolkit’s source code and distributive are avail- able at forge.ispras.ru/projects/retrascope.

Retrascope — Formal models extraction from source code: provides: — control flow graph; — guarded actions decision diagram; — high-level decision diagram; — extended finite state machine. — Functional test generation: — random tests; — dead code detection; — typical read-write error detection; — user-defined property checking. — Formal models analysis (model checking) on specifications conformance via: — PSL; — SystemVerilog Assertions. — Graphical user interface based on Eclipse IDE (and command line interface too): — running the tool with specified parameters; — model visualization (Zest, GraphML). — Open source (Apache License Version 2.0); — Extensibility at the source code level: — new models; — new engines for analysis and test generation.

46 workflow Retrascope requirements System stories deployment Retrascope audience? target Retrascope Who is (VHDL, Verilog) (VHDL, Specifications Specifications Parameters (PSL, SVA)(PSL, HDL

— — — Environment 8. WindowsSoftware: orGNU/Linux-based OS, Java Runtime development. Retrascope research isatthe active prototype stagewith — — — and verification: Research digitalhardware groups fieldof inthe verification. area working digitalhardware inthe Companies of design APIsandformatsOpen allow usingvarious for tools analysis Functional tests viaVerilog/VHDLFunctional andthe languages checkers SMVlanguage.Model viathe SMT solvers SMT-LIB viathe v2language; ­VCD format. Transformers Engine Extractors representation Intermediate 0 Engine Functional tests Functional Retrascope Launchers Toolchain Engines Printers 1 Models GADD HLDD EFSM Generators Simulators Engine n 47 Catalogue of Technologies SCINOON: EXPLORATORY SEARCH SYSTEM FOR SCIENTIFIC GROUPS

SciNoon is a system for collaborative exploration of scien- tific papers. It is essential for a group of researchers to dive quickly into the new area of knowledge and to find answers on their questions, following up with tracking new research on the topic of interest with highly customizable alerts.

Features and SciNoon is an innovative system designed to optimize long- advantages term teamwork with scientific papers. The papers could be added both from search systems and from digital libraries (like Google Scholar, arxiv.org, Semantic Scholar, PubMed) or be uploaded directly as PDF files. The key feature of SciNoon is graphical research maps that all team members can add papers to.

SciNoon — Shared workspace for collaborative processing of found scien- provides: tific papers. — Zooming research maps to control paper visualization details. — Deduplication and cleansing of the uploaded metadata made possible by an internal database that allows discovering con- nections between papers or authors. — Citation contexts classification into five classes depending on a citation role: — Background: a cited paper contains general information about the research area; — Use: a citing paper uses methods, data, and so on from the cited paper; — Compare: a citing paper points differences (or similarities) with the cited paper; — Extend: a citing paper continues the research from the cited paper; — Weak: a citing paper criticizes the cited one pointing to the authors’ mistakes; — Possibility to find relevant papers without keywords search using an integrated recommendation system. — A customizable list of questions whose answers should be looked up in a paper. Based on the retrieved answers the paper is visualized differently on the research map. — Possibility to group similar papers into clusters. — Notifications for team member actions, as well as support

48 workflow SciNoon stories deployment SciNoon audience? target Who isSciNoon PDF and/or metadata paper’s

Data acquisition Data acquisition processing subsystem service PDF — — — — —

advising students. whiledoingresearch inISPRAS isused andwhen SciNoon tory search onresearch projects. problem. scientific their previously results. gathered processing. to export datatoand anoption aCSV filefor more complex other. for quickopinionexchange team the andhelpingeach within Scientific advisers and their studentswhoare advisers andtheir doingexploraScientific for a tool collaborative whoneed teamwork. Scientists R&D departmentresearchers afast need to that solution Tracking new onagiven papers research topic andupdating answersCollected analysis viabuilt-in spreadsheet view HTTP API HTTP database Graph

Research map Web interface 2010 2019 2015 extended - 49 Catalogue of Technologies SVACE STATIC ANALYZER

Svace is an essential tool of the secure software develop- ment life cycle, the main static analyzer that is used in Sam- sung Corp. It detects more than 50 critical error types as well as hundreds of coding issues. Svace supports C, C++, C#, and Java, with preview support of Kotlin and Go. Svace is in- cluded in the Unified Register of Russian Programs (No.4047).

Features and Svace is an innovative technology based on years of research advantages that constantly evolves for customer’s needs. It combines the key qualities of foreign competitors (Coverity Scan Static Analysis, Klocwork Static Code Analysis, Fortify Static Code Analyzer) with the unique open industrial compilers usage to provide the maximal support level for new programming language standards.

Svace — High-quality deep analysis: provides: — an accurate representation of the source code (due to integration with any build system); — full path coverage taking into account function calling contexts when searching for complex defects; — high percentage of true positives (60-90%). — Scalability and high speed: — parallel analysis using up to 64 processor cores; — ability to analyze software with the code size of tens of millions of lines (analysis of the Android 6 OS having ­8 million lines of code takes about 5 hours); — supporting incremental system analysis in addition to the full analysis mode (performs a quick re-analysis of the recently changed source files). — Convenient warnings viewing interface: — detailed error description with code navigation; — review interface for marking true and false positives; — analysis results migration between runs with hiding any issues previously marked as false positives. — Accelerated customization (configuring existing detectors as well as writing individual ones available exclusively to this customer; creating tailored user interfaces); — Ultra fast adaptation to new environments and tools (adding new compilers within 1-2 weeks, in complex cases up to 2 months); — Full compatibility with regulatory documents and require- ments of regulators (FSTEC of the Russian Federation); — Can be used for adhering to the GOST R 56939-2016 re- quirements and to the requirements of the FSTEC regula- tion document mandating software vulnerability detection process (when certifying software within Russia).

50 architectures platforms and Supported stories deployment Svace audience? Svace target What is Compilers Supported — — — — — For Go:Go1.14.For Kotlin: KotlinFor 1.4. ECJ compiler. JavaFor (upto Javac Java OpenJDK Compiler, 11): Eclipse С#(upto C#7):Roslyn, Mono.For Compiler.Optimizing Tools,HINE16 Compilation Texas InstrumentsTMS320C6* for­C compiler Toshiba TLCS-870 Family, CalmS Samsung and R8CFamily, Panasonic CCompiler, MN10300Series C Compilers, C/C++ for Renesas Compiler M16CSeries the Wind River DiabCompiler, NEC/Renesas CA850, CC78K0(R) alView/ARM Tools Compilation (ARMCC),Intel C++Compiler, Clang (LLVM Microsoft Visual compiler), C++Compiler, Re С/С++ For (up to Collection), C++17):GCC(GNUCompiler gres Code, Professional, andSwemel. Security labs, includingRusBITech,and certification Kaspersky, Post Svace Russia, isdeployedWithin in more 30companies than submittedforchanges review TizenOS. andinclusioninthe 2017, homeappliances. Since Samsung Svace checks all insmartphones,­Tizen isused infotainment systems and onAndroidbased OS TizenOS aswell source asthe code. 2015.since company’s to the Itisused check own software Corp. inSamsung analyzerused Svace mainstatic isthe the GCC compiler are supported). GCCcompiler partially the 32/64, Hexagon (AEON,TriCore, HIDSP, targets of OpenRISC ARM/ARM64, MIPS/MIPS64, Power PC/Power PC64, RISC-V and 2. sion 3.13 andlater), Windows 7SP1andlater, includingWSL1 focus onhighreliability andsecurity; Target Intel x86/x86-64, code: analyzed architectures the of Host platforms for analyzer:Linuxkernel the OS based (ver laboratories.Certification developed software; the to certify need that Companies atsoftware aimed developmentCompanies aspecial with - - - - 51 Catalogue of Technologies Svace Architecture

C/C++ Java/Kotlin IR Svace Go

Interception Warnings

C# C# analysis

Build system — syntax coloring and code navigation support; — warning review support (assigning true / false positive status); — comparison of analysis runs with The analysis — lightweight abstract suppressing old false positives. intermediate syntax trees analysis; representation is — interprocedural created by own analysis (context compliers adapted sensitive and path from the open sensitive with industrial toolchains. symbolic execution); — tainted data analysis.

52 Features and FRAMEWORK PROCESSING TALISMAN: provides: Talisman advantages — — — — required for manualanalysis). results (reducing scientific resources up-to-date with cesses Itsadvantage routine analysis proAnalytics). isautomating (Palantircompetitors andIBMWatson Gotham Content extraction from text. Talisman iscomparable to world’s best structure retrieval, andTexterra, aplatform for semantic Dedoc, asystem technologies: for two ISPRAS document Talisman for Itbuildson bigdata. necessary tools the unifies Internet networks). sources (includingsocial datafromand work uniformly the private with and databases merge systems that analytical multi-user specialized ment of storage fast Itmakes the andvisualization. possible develop tasks,processing suchasdataretrieval, integration, analysis, Talisman automate that typicaldata tools setof isaunified tures to the interesting components without changingothers. without tures interesting components to the requiring userinteraction. — — — management andintegration: A scalablearchitecture allows andstoring processing that A flexible architecture modular allows addingnew fea that An easy to useweb interface allcomponents unifies that have that reusable APIsforA richsetof components easy storage andindexing includeanum These components. dataanalysis Analysis are components. tools automatic data retrieval They components. includeaframework data. store analysis results, automatic source data, anduser andinformation search databases of that engines ber system tools. andownact OCR ISPRAS Cassandra are Tesser used etc). the Thebasicservices ElasticSearch, on hard disks (PostgreSQL, ordatabases RussianProgramster of asNo.6045). Theoutputisstored Talisman.Flowthe Regis Unified system inthe (included are by asDocker containersthat APImanaged designed data from filestorages anddatabases. etc.developer portals Alsothere’s asystem for importing Youtube, LinkedIn etc.), blogs, news, MediaWikisites, VKontakte, Twitter,(Facebook, Instragram, Odnoklassniki, namely,for Internet datacollection, from media social A DATA DATA A ------53 Catalogue of Technologies more data just by adding more hardware without any software change. — Specialized components that monitor system status, manage event log, perform deployment, authentication and authoriza- tion, access control, and unidirectional data transfer. — Tools and methods for training machine learning models as well as for transferring existing algorithms to other knowl- edge domains; — A configurable knowledge domain scheme that may be changed by a user when working with the system. — An onsite deployment option using existing customer hard- ware or on the provided hardware. — Integration with private customer systems via provided com- ponent APIs. — Closed license free. Talisman is based on open source and know-how ISP RAS tools.

Talisman — Automated knowledge base construction for a given knowl- application edge domain and non-stop monitoring for new information areas regarding objects of interest. — Competitor intelligence based on open sources (OSINT). — Detecting information campaigns that aim to manipulate target audience as well as detecting the target audience for a campaign. — Detecting and analyzing means for disseminating information (used resources, people, bots) as well as analyzing commu- nity member communication roles (news source, opinion leader, disseminator, moderator, bot, commentator). — Reputation management for persons and companies, includ- ing monitoring relevant news, detecting possible complaints, monitoring leaks and information disclosure. — Staff management optimization including efficient recruit- ment, data verification, detecting hidden activity, assisting in developing motivational systems. — Evaluating activity effectiveness objectively and testing strat- egies on a target audience to gather feedback. — Finding and managing social tension to detect and prevent conflict escalation.

Supported Talisman supports languages recognized by the Texterra languages analyzer, namely, Russian and English.

54 workflow Talisman Extraction from Web (DBMS, FileSystem) Data Crawling and Data Import Unidirectional DataUnidirectional Transfer Subsystem Support Aplication ProgrammingAplication Interface (API) System Monitoring Health Deployment Subsystem of documents documents of the structure the Extractiong (Dedoc) Machine Learning Model ManagementSystemMachine Learning Model Streamig Data Processing (Talisman Stream) Processing Batch Data Processing (Textera) Text Processing Graph Processing Multimedia Authentication andAuthorization Authentication Access Control Access Web-Inerface Processed Data Event Log Raw and Store Storage andIndexing Search Indexes Components

Knowlege Base 55 Catalogue of Technologies TEXTERRA: A SEMANTIC ANALYZER

Texterra is a scalable platform for extracting semantics from text. It contains the complete fundamental set of technolo- gies for creating multifunctional applications for text analysis. Texterra bases its semantic analysis approach on concept identification. The platform is included in the Unified Register of Russian Programs (No.4048)

Features and Texterra performs a unique analysis of Russian texts based advantages on the identification of concepts instead of just words. It differs from foreign competitors by paying the most attention to Rus- sian language. The analyzer builds on fundamental research results and integrates with the Elasticsearch search system greatly expanding its capabilities. The successful combination of technologies allows the platform to compete with the pro- jects similar to IBM Watson Natural Language Understanding.

Texterra — High text processing speed (morphological analysis: ­69 000 provides words per second, syntactic analysis: 39 100 words per sec- ond, coreference resolution: 10 100 words per second, full text analysis: approximately 13 600 words per second); — Maximum attention to Russian language (unlike similar spaCy and UDPipe projects, as well as IBM Watson Natural Language Understanding, which does not support the analy- sis of emotions and concepts in Russian-language texts); — Large knowledge base (more than 7 million concepts); — Building the knowledge base without expert involvement (automatic construction and update using Wikipedia, ­MediaWiki, Linked Open Data, etc.); — Scalability both in word processing speed and knowledge base size (using Apache Ignite and the Asperitas cloud tech- nology developed at ISP RAS); — High text analysis accuracy due to a number of key features: — Multi-level search by related concepts; — Adaptability to slang, hashtags (#) and errors in text; — Emotion analysis (with separation of attitude towards objects and their attributes); — Determining relationships between people and compa- nies based on text information; — Detecting implicit object references in discussions. — Fast adaptation and tailored solutions development; — Support of two use cases: — As a deployed software system on a customer’s local server providing either HTTP REST-based or RMI proto- col access; — Online at texterra.ispras.ru. 56 requirements System languages Supported stories deployment Texterra audience? target Who isTexterra workflow Texterra morphological analysis with analysis with morphological language identification; analysis of syntaxand analysis of Linquistic analysis Linquistic error correction; semantics. module: — — — — — — — Texterra RussianandEnglishtexts. analyzes government Texterra. alsouses agencies frameworkas dataprocessing Talisman. Russian of Anumber Texterra backs upseveral innovative ISPRAS products such smartTVs).lyzing corporate Currently reports orsupporting project goalsweresung (the to develop for atechnology ana Texterra HPandSam jointprojects with inthe isproductized (such asinformation security, medicine, etc.); auditing, chine learning approach. to integrate new backed languages ma upby modern the 64-bit operating system isrecommended. ormore16 GbRAM for language; each supported A system platform by supported Java 8; arbitraryDevelopers of text applications. processing search semantic Developers systems of for domains certain Corporate software developers (e.g. developers); chatbot Simple andfast ability domainsandthe for support specific identification of mentioned mentioned of identification key recognition. concepts relationships extraction; Information extraction concepts; module:

taking slangandhashtags of social media users, media of social analyzes the opinions the analyzes Sentiment analysis Sentiment into account. module: module:

-

- - 57 Catalogue of Technologies TRAWL: A BINARY CODE ANALYSIS PLATFORM

Trawl is a unique production-level tool for analyzing various binary code features that supports multiple target processor architectures. It does not require debug information or source code. Trawl can be used to analyze all kinds of software rang- ing from boot loaders to user-level applications. It is included in the Unified Register of Russian Programs (No.5323).

Features and Trawl is a large software system based on decades of experi- advantages ence generated by compiler developers and information se- curity experts. Compared to similar research tools in the area of binary code analysis, Trawl is ready for production use.

Key features:

— full-system execution trace analysis where the entire soft- ware stack is evaluated, including system software; — unrestricted recovery of data and control flows on machine code level; — localization of individual algorithms in the code, formal pres- entation of their structure and semantics — identification of sensitive data leaks in memory based on a dynamic taint analysis over full-system execution traces; — automated binary code evaluation and completely automatic solutions for many everyday analysis tasks.

Trawl provides: — Modular platform architecture (allows extending the set of supported processor architectures and implementing new functional features); — Support for automating analysis via scripts and an open API (allows for integrating Trawl with other tools such as IDA Pro or Wireshark); — In-depth analysis: — only the binary code is required for analysis; — the approach is based on dynamic analysis of full-system execution traces, possibly augmented with static memory snapshot analysis; — automatic lifting of intermediate representation performed prior to the main analysis phase; — static program structure recovery for analyzed system’s parts, which is also supported using multiple program runs; — precise data flow analysis that accounts for hardware specifics (instruction pipeline, interrupts, virtual address translation, DMA); — interactive recovery of algorithm flowcharts based on con- 58 structing information flow slices; audience? target Who isTrawl workflow Trawl architectures platforms and Supported debugging, taintanalysis analysis tracing, tools: controlled execution deterministic replaydeterministic system being environment analyzed

— — — — — — — GUI: Advanced — performance: High — that runsoutsideanOS.that for anunknown code code OS andthe analyzingthe supports ARMv6, ARMv7. processor, atleast 16GiBRAM. regard analysis andsoftware to specific requirements. — — — — — — — Supported target operating systems:Supported Windows, Linux. Trawl target architectures: processor Supported x86, x86-64, Hardware requirements: Windows orLinuxOS, a64-bitx86 laboratories.Certification software andOS components; embedded Developers of Malware research laboratories; Technical feature possible with with enhancement support marking traces external with events (network communica arguments andreturn values view for functions; called high-level markupof automatic trace structure: processes execution trace provide views that search many kindsof forresearchedsupport analyzingthe system userscenari parallel worksta highscalabilityonmulticore analysis with approachthe implementedinTrawl isnotvulnerable to tions, userinputandoutput)hardware-related events. andsymbols; modules loaded and threads, interrupt handlers, callstacks, dynamically forwardboth andbackward intime; alongdataflows navigation instantaneous of possibility the similarto and navigation conventional debuggers, butwith take toos that run. alongtime tions; techniques.most known anti-analysis interactive work execution network traces traces API traffic analysis algorithms and algorithms description of of description (Wireshark) data formats preliminary representation information flow research flowchart construction level lifting Trawl example model model IDA Pro IDA - - - 59 Catalogue of Technologies ISP RAS: an innovation ecosystem

ISP RAS activities are aimed at deploying fundamental research results in industry. The institute’s business model consists of three closely related activities producing a syner- gistic effect:

— project-oriented fundamental and applied research aimed at creating new technologies (under contracts with Russian and foreign companies, the Ministry of Science and Higher Education of Russia, RAS programs, grants from the Russian Foundation for Basic Research and from Advanced Research Foundation, etc.); — deploying new technologies in partner companies and devel- oping innovations based on industry feedback; — educating students and postgraduates based on developed technologies (while participating in the institute’s research and industrial projects).

This model of industrial research plus education is well known and applied in research laboratories of leading univer- sities (Stanford, MIT, Berkeley, Carnegie Mellon) and industrial giants (IBM, Intel), as well as in state research centers (INRIA, Fraunhofer). When implemented effectively, the model solves the problem of the gap between science and industry, and produces highly qualified specialists in system programming.

Fundamental Fundamental research and experimental works are neces- research sary elements of the institute’s activities, allowing it to move in line with the latest trends in the IT world, as well as to generate own ideas for projects with business partners. ISP RAS works on a large number of scientific and educational programs and cooperates with leading Russian and foreign universities and scientific centers (ITRI (Taiwan), University of Passau (Germany), Israel Institute of Technology Technion, University of Belgrade, etc.). This allows providing high quality research results, and ISP RAS reputation in academic and university circles helps presenting its technologies on inter- national markets.

ISP RAS publishes its own journal called “Proceedings of the ISP RAS” (indexed in the Russian Science Citation Index (RSCI) and Scopus). The institute is also responsible for pub- lishing and editing the RAS journal called “Programming” (in- dexed in the Web of Science and Scopus). Both are included in the journal list of Higher Attestation Commission (the VAK). 60 collaboration Scientific Deployment Open source — — — as possible. everything capitalgrows todoes ensure this asquickly that institute and reputation are capital,andthe their contribution researcher ITcommunity. international isvisibleinthe Their tute’s research technologies. Open each means young that for insti awork andatool motivation promoting the both research of ISPRAS, openness resultspolicies. For the is itsauthor,visibility of contradicts whichoften ITcorporate result’s the implies activity andthe openness Scientific is considered as: for system modern programming.necessary source Open source software is absolutely open widelyused that tem isthe created the ecosys of components mostimportant the One of University universities andresearch andother institutes. Tomsk (RAS), Linguistics of Institute the formed with State platform. Theresearch Lingvodoc the uses isper that virtual laboratory endangered languages for documenting projects for industrialenterprises. Finally, thereisalinguistic implementsresearchcloud platform. Thelabsuccessfully Fanlight problems onthe mechanics based continuum Alsothere isalaboratorysystem components). for solving andoperating technology oncompiler (mainly focused improving for security Android andTizenOS) andHuawei including technology oncompiler (focused Electronics Currently, hasjoint laboratories institute Samsung the with ­ specialists system programming andorganizing training the young of newly emergingincreasing inthe areas competencies of allowthey planningflexibly available resources aswell as in aformjointlaboratory. of funding, Having permanent organized canbe ISPRAS Long-term with cooperation RusBITech,ones: GosNIIAS, BasaltSPO, VimpelCom, Swemel. Russian andalsothe Synchro LinuxFoundation, Software), Huawei,Samsung, HP, Systems (formerly Bentley Intel, Nvidia, for ISPRAS more fivewith than years. include companies Such long-termformed with partners whohave collaborating been institute’sthe contract the work technologies. Mostof isper and research organizations, whichuseandpromote further deploysISP RAS itsresearch results invarious industrial Center created to develop andpromote standards. Linuxopen hasaRussianLinuxOS institute Verification the In addition, used toused train engineers. infrastructure source projects open canbe international of andservices; of products outsourcing contracts butinteracting globalmarket with standards; nologies, includingready-to-use software products andopen a powerful educational resource environmenta powerful asthe educational and an abilityto innovative ensure institute the research without tech provides to allmodern free that a tool access legitimate with the skills needed by partners. skillsneeded the with - - - - - 61 Catalogue of Technologies Education The ISP RAS innovation ecosystem cornerstone is educational activity, which is carried out in several directions:

— Cooperating with leading universities. Institute specialists are working on system programming chairs of MSU, MIPT and HSE. In the first year of study at ISP RAS, third-year students listen to lectures, attend special seminars, get acquainted with the institute’s scientific directions, participate in projects and receive a special scholarship. By the time of graduation, many students have scientific publications and become system programming experts. — Scholarship program. ISP RAS launched a special scholarship program for students and postgraduates of MSU, MIPT, HSE, Novgorod State University, Russian-Armenian University, etc. — ISP RAS postgraduate study helps gaining practical expe- rience and learning new technologies at the same time. Postgraduates are actively involved in education: they instruct seminars and practical classes for students, supervise term papers and theses. After getting a degree they usually become leaders of small research groups. — System programming labs network. Currently, ISP RAS external labs are working and growing in Yerevan and Veliky Novgorod, as well as in Orel State University and Security Guard Service Federal Academy. The laboratories attract successful students and involve them in the development of promising technolo- gies in close cooperation with industry.

In 2017 ISP RAS together with Samsung opened the Samsung IoT Academy based on MIPT. In the Academy students study IoT use cases in various industries and make their own proto- types of IoT devices. In 2018, the second part of this project was launched at the MIPT Department of Aeromechanics and Flight Engineering in Zhukovsky.

62 of Notes

63