Overview
- \K{} Framework
- Ethereum Virtual Machine
- KEVM Uses
Links
- KEVM Repository: https://github.com/kframework/evm-semantics
- K Repository: https://github.com/kframework/k
- Runtime Verification: https://runtimeverification.com
- Verified smart contracts: https://github.com/runtimeverification/verified-smart-contracts
\K{} Framework
The Vision: Language Independence
Separate development of PL software into two tasks:
. . .
The Programming Language
PL expert builds rigorous and formal spec of the language in a high-level human-readable semantic framework.
. . .
The Tooling
Build each tool once, and apply it to every language, eg.:
- Parser
- Interpreter
- Debugger
- Compiler
- Model Checker
- Program Verifier
The Vision: Language Independence
Current Semantics
Many languages have full or partial \K{} semantics, this lists some notable ones (and their primary usage).
- C: detecting undefined behavior
- Java: detecting racy code
- EVM: verifying smart contracts
- LLVM: compiler validation (to x86)
- JavaScript: finding disagreements between JS engines
- P4: SDN data-layer verification
- many others ...
\K{} Specifications: Syntax
Concrete syntax built using EBNF style:
syntax Exp ::= Int | Id | "(" Exp ")" [bracket]
| Exp "*" Exp
> Exp "+" Exp // looser binding
syntax Stmt ::= Id ":=" Exp
| Stmt ";" Stmt
| "return" Exp
. . .
This would allow correctly parsing programs like:
a := 3 * 2;
b := 2 * a + 5;
return b
\K{} Specifications: Configuration
Tell \K{} about the structure of your execution state. For example, a simple imperative language might have:
configuration <k> $PGM:Program </k>
<env> .Map </env>
<store> .Map </store>
. . .
<k>
will contain the initial parsed program<env>
contains bindings of variable names to store locations<store>
contains bindings of store locations to integers
\K{} Specifications: Transition Rules
Using the above grammar and configuration:
. . .
Variable lookup
rule <k> X:Id => V ... </k>
<env> ... X |-> SX ... </env>
<store> ... SX |-> V ... </store>
. . .
Variable assignment
rule <k> X := I:Int => . ... </k>
<env> ... X |-> SX ... </env>
<store> ... SX |-> (V => I) ... </store>
Example Execution
Program
a := 3 * 2;
b := 2 * a + 5;
return b
Initial Configuration
<k> a := 3 * 2 ; b := 2 * a + 5 ; return b </k>
<env> a |-> 0 b |-> 1 </env>
<store> 0 |-> 0 1 |-> 0 </store>
Example Execution (cont.)
Variable assignment
rule <k> X := I:Int => . ... </k>
<env> ... X |-> SX ... </env>
<store> ... SX |-> (V => I) ... </store>
Next Configuration
<k> a := 6 ~> b := 2 * a + 5 ; return b </k>
<env> a |-> 0 b |-> 1 </env>
<store> 0 |-> 0 1 |-> 0 </store>
Example Execution (cont.)
Variable assignment
rule <k> X := I:Int => . ... </k>
<env> ... X |-> SX ... </env>
<store> ... SX |-> (V => I) ... </store>
Next Configuration
<k> b := 2 * a + 5 ; return b </k>
<env> a |-> 0 b |-> 1 </env>
<store> 0 |-> 6 1 |-> 0 </store>
Example Execution (cont.)
Variable lookup
rule <k> X:Id => V ... </k>
<env> ... X |-> SX ... </env>
<store> ... SX |-> V ... </store>
Next Configuration
<k> a ~> b := 2 * [] + 5 ; return b </k>
<env> a |-> 0 b |-> 1 </env>
<store> 0 |-> 6 1 |-> 0 </store>
Example Execution (cont.)
Variable lookup
rule <k> X:Id => V ... </k>
<env> ... X |-> SX ... </env>
<store> ... SX |-> V ... </store>
Next Configuration
<k> 6 ~> b := 2 * [] + 5 ; return b </k>
<env> a |-> 0 b |-> 1 </env>
<store> 0 |-> 6 1 |-> 0 </store>
Example Execution (cont.)
Variable lookup
rule <k> X:Id => V ... </k>
<env> ... X |-> SX ... </env>
<store> ... SX |-> V ... </store>
Next Configuration
<k> b := 2 * 6 + 5 ; return b </k>
<env> a |-> 0 b |-> 1 </env>
<store> 0 |-> 6 1 |-> 0 </store>
Example Execution (cont.)
Variable assignment
rule <k> X := I:Int => . ... </k>
<env> ... X |-> SX ... </env>
<store> ... SX |-> (V => I) ... </store>
Next Configuration
<k> b := 17 ~> return b </k>
<env> a |-> 0 b |-> 1 </env>
<store> 0 |-> 6 1 |-> 0 </store>
Example Execution (cont.)
Variable assignment
rule <k> X := I:Int => . ... </k>
<env> ... X |-> SX ... </env>
<store> ... SX |-> (V => I) ... </store>
Next Configuration
<k> return b </k>
<env> a |-> 0 b |-> 1 </env>
<store> 0 |-> 6 1 |-> 17 </store>
Ethereum Virtual Machine
Blockchain
- Append-only ledger of user-submitted transactions.
- Mostly simple transfers of value.
- Miners select next block of transactions using consensus algorithm.[^notThisTalk]
- Ethereum adds:
- New currency! (Ether)
- Accounts have an associated storage and code which only transactions by that account can read/modify (onchain).
- Transactions can have associated programs written in EVM (Ethereum Virtual Machine).[^smartContracts]
- Miners execute transaction code to calculate the new world state (new account balances, storage, code, etc...).
[^notThisTalk]: This talk is not about the consensus algorithms. [^smartContracts]: Cool kids call these programs smart contracts, or contracts.
EVM (Ethereum Virtual Machine)
Is a stack-based bytecode over 256-bit words.
Example program (sum 0 - 10):
\footnotesize
PUSH(1, 0) ; PUSH(1, 0) ; MSTORE
; PUSH(1, 10) ; PUSH(1, 32) ; MSTORE
; JUMPDEST
; PUSH(1, 0) ; PUSH(1, 32) ; MLOAD ; GT
; ISZERO ; PUSH(1, 43) ; JUMPI
; PUSH(1, 32) ; MLOAD ; PUSH(1, 0) ; MLOAD ; ADD ; PUSH(1, 0) ; MSTORE
; PUSH(1, 1) ; PUSH(1, 32) ; MLOAD ; SUB ; PUSH(1, 32) ; MSTORE
; PUSH(1, 10) ; JUMP
; JUMPDEST
; PUSH(1, 0) ; MLOAD ; PUSH(1, 0) ; SSTORE
; .OpCodes
\normalsize
Enforcing Termination (Gas)
- User submitted transactions could have infinite loops (DoS attacks!).
- Solution:
- Each opcode costs some gas, paid for in Ether.
- Spent gas awarded to miner, remaining refunded to account.
Notes:
- It's important to charge according to actual used compute resources.
- Tuning gas costs is an ongoing challenge.
- New hardware available new gas model.
- Hardforks update the gas constants/rules.
Intercontract Executions (and the ABI)
- Contracts can call other contracts (re-entrancy!).
- Payload to other contract is a raw string of bytes called
callData
. - External to the EVM, the Ethereum ABI has been developed:
- Specifies:
- Calling conventions (how to interpret
callData
correctly). - Some high-level types (and their mapping to EVM words).
- Calling conventions (how to interpret
EVM Nuisances
- Unstructured controlflow: dynamically calculable jump destinations.
- 256-bit words: useful for crypto, pain for hardware.
- 8-bit word-array local memory: complicates symbolic reasoning.
- No built-in calling conventions: ABI is declared external to EVM.
eval
capability: copying bytes as code and running is bad for analysis.
KEVM: Status and Current Uses
Correctness and Performance
- Pass the VMTests and GeneralStateTests of the Ethereum Test Suite.
. . .
-
Roughly an order of magnitude slower than cpp-ethereum (reference implementation in C++).
Test Set (no. tests) KEVM cpp-ethereum
VMAll (40683) 99:41 4:42 GSNormal (22705) 35:00 1:30
. . .
-
(Non-empty) lines of code comparison:
KEVM: 2644
cpp-ethereum: 4588
Assertion Violation Checker
- EVM has designated
INVALID
opcode, halts and discards gas. - Has been used to specify "assertion failure" in HLLs.[^hlls]
- Symbolically execute and look for paths ending in
INVALID
.
. . .
- Co-invariant checking - "
INVALID
never happens". - Very little tooling effort needed after building semantics.
- Similar techniques used to find undefined behaviour in C programs.
[^hlls]: High-level languages (which compile to EVM).
Reachibility Logic Prover [@stefanescu-ciobaca-mereuta-moore-serbanuta-rosu-2014-rta]
- Takes operational semantics as input (no axiomatic semantics).[^inthiscaseK]
- Generalization of Hoare Logic:
- Hoare triple:
- Reachability claim:
- "Any instance of which satisfies either does not terminate or reaches an instance of (empty program) which satisfies ". No need for end state in RL to be though.
- See paper for example claims.
. . .
- Reachability claims are fed to \K{} prover as normal \K{} rules.
- Functional correctness directly specifiable as set reachability claims.
- Correctness bugs often lead to security bugs.
[^inthiscaseK]: In this case, we use \K{} to specify the operational semantics.
Reachability Logic Inference System [@stefanescu-ciobaca-mereuta-moore-serbanuta-rosu-2014-rta]
Case Study: ERC20
- "Hello world" of Ethereum smart contracts.
- First full contract we verified.
- Key-value store of addresses to token balances.
- Simple methods for purchasing/transferring/selling tokens.
. . .
ERC20 Uses
- Codify ownership distribution of a company (tokenizing equity).
- Tokenize pink-slips/deeds (eg. tokenizing houses, cars).
- Ponzi schemes.
Case Study: ERC20 - Scaling Verification
- First ERC20 verification took several person-months.
. . .
- Define state abstractions directly in \K{} itself.
nthbyteof
for storing symbolic word (256-bits) as 32 bytes (when written tolocalMem
).#abiCallData
allows using actual function name/signature in call instead of packed bytes representation.
. . .
- Modular specifications
- Re-use same specs for different implementations, only need delta.
- Have 5 verified implementations of ERC20, all different behaviors.[^verifiedSmartContracts]
. . .
- Now, can verify a (reasonable) ERC20 in about a day.
[^verifiedSmartContracts]: See our audited ERC20s (and other contracts) at https://github.com/runtimeverification/verified-smart-contracts.
Tool-Assisted Security Audits
-
Runtime Verification, Inc. offers audits as a service, typical process:
- Specify high-level business logic in English, confirm with customer.[^usuallyNotDone]
- Refine to a mathematical definition, confirm with customer.
- Refine to a set of K Reachability Claims, confirm with customer.
- Fix bugs in contract and specification until the K prover is satisfied.
- Send all documents generated and list of issues found to customer.
-
Independent groups also using KEVM to verify contract stack (eg. DappHub).
[^usuallyNotDone]: Surprisingly, many smart contract developers don't have a high-level specification of their contracts.
Thanks!
Conclusion
- \K{} used to specify semantics of Ethereum Virtual Machine (EVM).
- Commercial verification efforts carried out with \K{} prover.
Sponsors
The work presented in this paper was supported in part by the Boeing grant on "Formal Analysis Tools for Cyber Security" 2016-2017, the NSF grants CCF-1318191, CCF-1421575, CNS-1330599, CNS-1514163, and IIP-1660186, the NSF Graduate Fellowship under Grant No. DGE-1650441, an IOHK gift, and a grant from the Ethereum Foundation.