2. Overview
• Fifteenth International Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS 2010)
– March 15-17, 2010
– Pittsburgh, PA
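– 182 submissions (a record high), 32 accepted (18%), 3 Best Papers
• Poster session held. Five posters from Japan (Hiraki Lab at the University of Tokyo, two from Nakajima Lab at Waseda University, Murakami Lab at Kyushu University, Kourai Lab at Kyushu Institute of Technology)
– About 400 attendees.
– Keynote speech by Eric Brewer (UCB), recipient of the ACM-Infosys Foundation Award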
• Workshops
– 2nd WIOV (Workshop on I/O Virtualization)
– Workshop on Architecting Memory Technologies (this was a panel)
– Not attended: Workshop on General-Purpose Computation on Graphics
Processing Units
• ASPLOS 2011 will be held in Newport Beach, California, March 5-11, 2011
– asplos11.cs.ucr.edu/
– Abstract Deadline: Monday, July 19, 2010
– Full Paper Deadline: Monday, July 26, 2010 (11:59pm EDT)
3. Program, Day 1
• Session 1: Novel Architectures (Session Chair: Luis Ceze)
– Best Paper! Dynamically Replicated Memory: Building Reliable Systems from Nanoscale Resistive
Memories
• Engin Ipek, Jeremy Condit, Edmund B. Nightingale, Doug Burger and Thomas Moscibroda (University of Rochester / Microsoft Research)
– A Power-efficient All-optical On-chip Interconnect Using Wavelength-based Oblivious Routing
• Nevin Kirman and Jose Martinez (Cornell University)
• Session 2: Compilers and Runtime Systems (Session Chair: Michael Hind)
– Best Paper! A Real System Evaluation of Hardware Atomicity for Software Speculation
• Naveen Neelakantam, David Ditzel and Craig Zilles (University of Illinois at Urbana-Champaign; Intel)
– Dynamic filtering: multi-purpose architecture support for language runtime systems
• Tim Harris, Adrian Cristal, Sasa Tomic and Osman Unsal (Microsoft Research)
• Session 3: Parallel Programming 1 (Session Chair: Yuanyuan Zhou)
– CoreDet: A Compiler and Runtime System for Deterministic Multithreaded Execution
• Tom Bergan, Owen Anderson, Joe Devietti, Luis Ceze and Dan Grossman (University of Washington)
– Speculative Parallelization Using Software Multi-threaded Transactions
• Arun Raman, Hanjun Kim, Thomas R. Mason, Thomas B. Jablin and David I. August (Princeton University)
– Respec: Efficient online multiprocessor replay via speculation and external determinism
• Dongyoon Lee, Benjamin Wester, Kaushik Veeraraghavan, Satish Narayanasamy, Peter Chen and Jason Flinn (University of Michigan)
• Session 4: Scheduling in Parallel Systems (Session Chair: Tim Harris)
– Probabilistic Job Symbiosis Modeling for SMT Processor Scheduling
• Stijn Eyerman and Lieven Eeckhout (Ghent University)
– Request Behavior Variations
• Kai Shen (University of Rochester)
– Decoupling contention management from scheduling
• Ryan Johnson, Radu Stoica, Anastasia Ailamaki and Todd Mowry (EPFL; Carnegie Mellon University)
– Addressing Shared Resource Contention in Multicore Processors Via Scheduling
• Sergey Zhuravlev, Sergey Blagodurov and Alexandra Fedorova (Simon Fraser University)
4. Program, Day 2 (1/2)
• Session 5. Software Reliability (Session Chair: Emery Berger)
– SherLog: Error Diagnosis by Connecting Clues from Run-time Logs
• Ding Yuan, Haohui Mai, Weiwei Xiong, Lin Tan, Yuanyuan Zhou and Shankar Pasupathy (University of California, San Diego;
University of Illinois at Urbana-Champaign)
– Analyzing Multicore Dumps to Facilitate Concurrency Bug Reproduction
• Dasarath Weeratunge, Xiangyu Zhang and Suresh Jagannathan (Purdue University)
– A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs
• Sebastian Burckhardt, Pravesh Kothari, Madanlal Musuvathi and Santosh Nagarakatte (Microsoft Research)
– ConMem: Detecting Severe Concurrency Bugs Through an Effect-Oriented Approach
• Wei Zhang, Chong Sun and Shan Lu (University of Wisconsin-Madison)
• Session 6. Hardware Power and Energy (Session Chair: David Wood)
– Characterizing Processor Thermal Behavior
• Francisco J. Mesa-Martínez, Ehsan K. Ardestani and Jose Renau (University of California, Santa Cruz)
– Conservation Cores: Reducing the Energy of Mature Computations
• Ganesh Venkatesh, John Sampson, Nathan Goulding, Saturnino Garcia, Vladyslav Bryksin, Jose Lugo-Martinez, Steve Swanson
and Michael Taylor (University of California, San Diego)
– Micro-Pages: Increasing DRAM Efficiency with Locality-Aware Data Placement
• Kshitij Sudan, Niladrish Chatterjee, David Nellans, Manu Awasthi, Rajeev Balasubramonian and Al Davis (University of Utah)
5. Program, Day 2 (2/2)
• Session 7. Data Centers (Session Chair: Scott Mahlke)
– Power Routing: Dynamic Power Provisioning in the Data Center
• Steven Pelley, David Meisner, Pooya Zandevakili, Jack Underwood and Thomas Wenisch (University of Michigan)
– Joint Optimization of Idle and Cooling Power in Data Centers While Maintaining
Response Time
• Faraz Ahmad and T. N. Vijaykumar (Purdue University)
• Session 8. Hardware Monitoring (Session Chair: Peter Chen)
– Butterfly Analysis: Adapting Dataflow Analysis to Dynamic Parallel Monitoring
• Michelle Goodstein, Evangelos Vlachos, Shimin Chen, Phillip Gibbons, Michael Kozuch and Todd Mowry (Carnegie Mellon
University; Intel Labs Pittsburgh)
– ParaLog: Enabling and Accelerating Online Parallel Monitoring of Multithreaded
Applications
• Evangelos Vlachos, Michelle Goodstein, Michael Kozuch, Shimin Chen, Babak Falsafi, Phillip Gibbons and Todd Mowry (Carnegie
Mellon University; Intel Labs Pittsburgh; EPFL)
• Session 9. Parallel Programming 2 (Session Chair: Tim Harris)
– MacroSS: Macro-SIMDization of Streaming Applications
• Amir Hormati, Yoonseo Choi, Mark Woh, Manjunath Kudlur, Rodric Rabbah, Trevor Mudge and Scott Mahlke (University of
Michigan)
– COMPASS: A Programmable Data Prefetcher Using Idle GPU Shaders
• Dong Hyuk Woo and Hsien-Hsin Lee (Georgia Institute of Technology)
– Flexible Architectural Support for Fine-grain Scheduling
• Daniel Sanchez, Richard Yoo and Christos Kozyrakis (Stanford University)
6. Program, Day 3
• Session 10. Parallel Memory Systems (Session Chair: Carl Waldspurger)
– Specifying and Dynamically Verifying Address Translation-Aware Memory Consistency
• Bogdan Romanescu, Alvin Lebeck and Daniel Sorin (Duke University)
– Best Paper! Fairness via Source Throttling: A Configurable and High-Performance
Fairness Substrate for Multi-Core Memory Systems
• Eiman Ebrahimi, Chang Joo Lee, Onur Mutlu and Yale Patt (The University of Texas at Austin)
– An Asymmetric Distributed Shared Memory Model for Heterogeneous Parallel
Systems
• Isaac Gelado, Javier Cabezas, John Stone, Sanjay Patel, Nacho Navarro and Wen-mei Hwu (University of Illinois at
Urbana-Champaign; UPC)
– Inter-Core Cooperative TLB Prefetchers for Chip Multiprocessors
• Abhishek Bhattacharjee and Margaret Martonosi (Princeton University)
• Session 11. Security and Hardware Reliability (Session Chair: Vikram Adve)
– Orthrus: Efficient Software Integrity Protection on Multi-Cores
• Ruirui Huang, Dan Deng and G. Edward Suh (Cornell University)
– Shoestring: Probabilistic Soft-error Resilience on the Cheap
• Shuguang Feng, Shantanu Gupta, Amin Ansari and Scott Mahlke (University of Michigan)
– Virtualized and Flexible ECC for Main Memory
• Doe Hyun Yoon and Mattan Erez (The University of Texas at Austin)
7. Dynamically Replicated Memory: Building Reliable
Systems from Nanoscale Resistive Memories
Engin Ipek, Jeremy Condit, Edmund B. Nightingale, Doug Burger and Thomas Moscibroda
(University of Rochester / Microsoft Research)
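• How to use PCM (Phase Change Memory), a candidate for next-generation main memory
– Can be manufactured below the 40 nm scale and is high density, but once a cell fails it cannot be repaired
– A failed page (primary) is recovered by pairing it with a backup page
– The primary-to-backup mapping is done in a Physical -> Real address translation step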
[Figure: a primary page paired with a backup page. X marks a dead byte, which is identified by its broken parity.]
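The pairing idea can be sketched in C as follows. This is a minimal illustration, not the paper's hardware design: the `drm_byte` layout, the parity convention, the tiny page size, and the names `drm_store`, `drm_kill` and `drm_read` are all assumptions made for the example. A byte whose stored parity does not match its data is treated as dead, and the read falls back to the backup page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 8 /* tiny page, for illustration only */

/* One stored byte plus its parity bit (assumed layout). */
typedef struct {
    uint8_t data;
    uint8_t parity; /* XOR of the data bits while the byte is healthy */
} drm_byte;

/* XOR of all eight bits of v. */
static uint8_t parity_of(uint8_t v)
{
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/* Write a healthy byte with matching parity. */
static void drm_store(drm_byte *page, size_t i, uint8_t v)
{
    page[i].data = v;
    page[i].parity = parity_of(v);
}

/* Mark a byte dead by deliberately storing the wrong parity. */
static void drm_kill(drm_byte *page, size_t i)
{
    page[i].parity = (uint8_t)(parity_of(page[i].data) ^ 1u);
}

static bool drm_alive(const drm_byte *page, size_t i)
{
    return page[i].parity == parity_of(page[i].data);
}

/* Read byte i: use the primary unless it is dead, then try the backup.
 * Returns false when both copies are dead at this offset. */
static bool drm_read(const drm_byte *primary, const drm_byte *backup,
                     size_t i, uint8_t *out)
{
    if (drm_alive(primary, i)) { *out = primary[i].data; return true; }
    if (drm_alive(backup, i))  { *out = backup[i].data;  return true; }
    return false;
}
```

A pair of pages stays usable as long as the two pages do not have dead bytes at the same offset, which is why `drm_read` only fails when both checks fail.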
8. Dynamic filtering: multi-purpose architecture support for
language runtime systems
Tim Harris, Adrian Cristal, Sasa Tomic and Osman Unsal (Microsoft Research)
• Adding "dyfl", a read/write barrier instruction that checks memory accesses,
makes GC, Software Transactional Memory, and Control & Data
Flow Integrity (XFI [OSDI06], WIT [SP08], DFI [OSDI06]) more efficient
Write barrier used by GC:
void writeBarrier(void **addr, void *tgt) {
  if (inOldGen(addr) && inYoungGen(tgt)) {   // T1
    log(addr);                               // L1
  }
}
Write barrier with dyfl added:
void writeBarrierDyfl(void **addr, void *tgt) {
  if ((!dyfl_card_pair(addr, tgt, 0x1)) &&   // A1
      (!dyfl_addr(addr, 0x2))) {             // A2
    if (inOldGen(addr) && inYoungGen(tgt)) { // T1
      dyfl_set_addr(addr, 0x2);              // S2
      log(addr);                             // L1
    } else {
      dyfl_set_card_pair(addr, tgt, 0x1);    // S1
    }
  }
}
T = test, L = log, S = set, A = address
dyfl(i1, i2, mask, tag) // Test dynamic filter
dyfl_set(i1, i2, mask, tag) // Set dynamic filter
dyfl_clear(i1, i2, mask, tag) // Clear specific entry
dyfl_clear(tag) // Clear all with tag
Question: how does this differ from a hardware breakpoint?
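One way to picture the filter behind `dyfl` is as a small, lossy table of recently recorded operand pairs. The C sketch below emulates that idea in software. It is an assumption-laden illustration, not the proposed hardware: the table size, the hash, and the names `sim_dyfl`, `sim_dyfl_set` and `sim_dyfl_clear_tag` are made up for the example, and the `mask` argument of the interface listed above is omitted for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILTER_SLOTS 64 /* assumed table size */

typedef struct {
    bool valid;
    uintptr_t i1, i2;
    unsigned tag;
} filter_entry;

static filter_entry filter[FILTER_SLOTS];

/* Direct-mapped placement; the hash is arbitrary (hypothetical). */
static size_t slot_of(uintptr_t i1, uintptr_t i2, unsigned tag)
{
    uintptr_t h = i1 * 31u + i2 * 17u + tag;
    return (size_t)(h % FILTER_SLOTS);
}

/* Test: true only if this exact (i1, i2, tag) was set and not evicted. */
static bool sim_dyfl(uintptr_t i1, uintptr_t i2, unsigned tag)
{
    const filter_entry *e = &filter[slot_of(i1, i2, tag)];
    return e->valid && e->i1 == i1 && e->i2 == i2 && e->tag == tag;
}

/* Set: record the pair, silently evicting whatever shared the slot. */
static void sim_dyfl_set(uintptr_t i1, uintptr_t i2, unsigned tag)
{
    filter_entry *e = &filter[slot_of(i1, i2, tag)];
    e->valid = true;
    e->i1 = i1;
    e->i2 = i2;
    e->tag = tag;
}

/* Clear every entry carrying the given tag. */
static void sim_dyfl_clear_tag(unsigned tag)
{
    for (size_t s = 0; s < FILTER_SLOTS; s++)
        if (filter[s].valid && filter[s].tag == tag)
            filter[s].valid = false;
}
```

In this sketch a miss only means "not known to be filtered", so a caller such as the write barrier above would still have to run its full check on a miss. Evictions therefore cost time but not correctness.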
9. Micro-Pages: Increasing DRAM Efficiency
with Locality-Aware Data Placement
Kshitij Sudan, Niladrish Chatterjee, David Nellans, Manu Awasthi, Rajeev Balasubramonian
and Al Davis (University of Utah)
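• Motivation: with multicore, memory accesses have become finer-grained, and the hit rate of the
8 KB DRAM row buffer is falling (figure omitted; 64-byte cache blocks)
• Identify frequently accessed data and migrate it so that the hit rate rises (hardware-assisted
migration)
• Set the OS page size to 1 KB and use 4 KB superpages (the processor TLB mechanism for
variable page granularity)
– Reference: "Implementation and Performance Evaluation of Linux Super Page for 2.6-series
Kernels" http://shimizu-lab.dt.u-tokai.ac.jp/thesis/master/6adgm007.pdf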
• Average performance ↑ 9% (max. 18%)
• Average memory energy consumption ↓ 18% (max. 62%)
• Average row-buffer utilization ↑ 38%
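The effect of co-locating hot data can be illustrated with a toy row-buffer model in C. This is a sketch under simplifying assumptions (a single bank, an open-page policy, and a software address remap standing in for the hardware-assisted migration). The names `row_buffer_hits` and `migrate` are hypothetical and not taken from the paper.

```c
#include <stddef.h>
#include <stdint.h>

#define ROW_BYTES        8192u /* 8 KB row buffer, as on the slide */
#define MICRO_PAGE_BYTES 1024u /* 1 KB micro-page, as on the slide */

/* Count accesses that hit the currently open row of a single bank. */
static size_t row_buffer_hits(const uint64_t *addrs, size_t n)
{
    size_t hits = 0;
    uint64_t open_row = UINT64_MAX; /* sentinel: no row open yet */
    for (size_t i = 0; i < n; i++) {
        uint64_t row = addrs[i] / ROW_BYTES;
        if (row == open_row)
            hits++;
        open_row = row; /* the accessed row stays open */
    }
    return hits;
}

/* Relocate addr if it falls in micro-page `from`; otherwise leave it. */
static uint64_t migrate(uint64_t addr, uint64_t from, uint64_t to)
{
    if (addr / MICRO_PAGE_BYTES == from)
        return to * MICRO_PAGE_BYTES + addr % MICRO_PAGE_BYTES;
    return addr;
}
```

A trace that ping-pongs between two rows gets no hits in this model. Once the hot micro-page from the second row is migrated into the first row, every access after the first becomes a hit.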
10. Orthrus: Efficient Software Integrity
Protection on Multi-Cores
Ruirui Huang, Dan Deng and G. Edward Suh (Cornell University)
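• Create a replica process whose fine-grained memory layout differs from the original.
• Run both processes and check that their memory accesses touch the same content (at different
addresses); a mismatch reveals a buffer overflow or dangling pointer
– Orthrus is the two-headed dog of Greek mythology, brother of Cerberus.
Related work (both release their source code):
DieHard [PLDI06] http://prisms.cs.umass.edu/emery/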
N-variant [USENIX-Security06] http://www.cs.virginia.edu/nvariant/
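Why differing layouts expose memory errors can be shown with a small C simulation. This is only a sketch of the general replica-comparison idea, not Orthrus itself: the `replica` struct, the offsets, and the function names are assumptions for the example, and each "heap" is a plain byte array so the simulation stays memory-safe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEAP_BYTES 64
#define BUF_LEN 8 /* logical size of the buffer object */

/* A replica: its own heap image plus its own object placement. */
typedef struct {
    uint8_t heap[HEAP_BYTES];
    size_t buf_off;    /* where the BUF_LEN-byte buffer lives */
    size_t victim_off; /* where a 1-byte neighbouring object lives */
} replica;

static void replica_init(replica *r, size_t buf_off, size_t victim_off)
{
    memset(r->heap, 0, sizeof r->heap);
    r->buf_off = buf_off;
    r->victim_off = victim_off;
    r->heap[victim_off] = 0x5A; /* same logical content in every replica */
}

/* Unchecked store buf[idx] = v, as a buggy program would perform it. */
static void replica_store(replica *r, size_t idx, uint8_t v)
{
    size_t a = r->buf_off + idx;
    if (a < HEAP_BYTES) /* keeps the simulation itself in bounds */
        r->heap[a] = v;
}

static uint8_t replica_load_victim(const replica *r)
{
    return r->heap[r->victim_off];
}

/* Run the same store in both replicas, then compare what they load. */
static bool replicas_diverge(replica *a, replica *b, size_t idx, uint8_t v)
{
    replica_store(a, idx, v);
    replica_store(b, idx, v);
    return replica_load_victim(a) != replica_load_victim(b);
}
```

An in-bounds store (index below `BUF_LEN`) leaves both replicas in agreement. An overflowing store lands on the neighbouring object in one layout but on unused space in the other, so the later loads disagree and the error is detected.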
11. Virtualized and Flexible ECC for Main Memory
Doe Hyun Yoon and Mattan Erez (The University of Texas at Austin)
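• Conventional ECC memory carries dedicated check bits; this work virtualizes
the check bits (Tier 1: simple, Tier 2: strong) so that they can be mapped into the
ordinary memory space.
– Benefits: limits the growth in check bits; saves power
• Depending on the DIMM configuration (DDR2, burst 4):
– x4 DDR2, burst 4: 64 bit -> 4 B T1EC
– x8 DDR2, burst 4: 64 bit -> 8 B T1EC
• T2 uses chipkill-correct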
13. WIOV 2010
Second Workshop on I/O Virtualization
• About 30 participants; everyone introduced themselves
• Storage
– SLIM: Network Decongestion for Storage Systems
• Madalin Mihailescu, Gokul Soundararajan and Cristiana Amza (University of Toronto).
– On Disk I/O Scheduling in Virtual Machines
• Mukil Kesavan, Ada Gavrilovska and Karsten Schwan (Georgia Institute of Technology).
• Networking
– Ally: OS-Transparent Packet Inspection Using Sequestered Cores
• Jen-Cheng Huang (Georgia Tech), Matteo Monchiero and Yoshio Turner (HP Labs).
– A Network Interface Card Architecture for I/O Virtualization in Embedded Systems
• Holm Rauchfuss, Thomas Wild and Andreas Herkersdorf (Technische Universität München).
– Architectural support for user-level network interfaces in heavily virtualized systems
• Florian Auernhammer and Patricia Sagmeister (IBM Research).
• Keynote by Paul Congdon (HP)
– Enabling Truly Converged Infrastructure
• Power and Performance Bottlenecks
– Redesigning Xen's Memory Sharing Mechanism for Safe and Efficient I/O
Virtualization
• Kaushik Kumar Ram (Rice University), Jose Renato Santos and Yoshio Turner (HP Labs).
– Power Aware I/O Virtualization
• Kun Tian and Yaozu Dong (Intel).
– I/O Virtualization Bottlenecks in Cloud Computing Today
• Jeffrey Shafer (Rice University).
• Website: http://sysrun.haifa.il.ibm.com/hrl/wiov2010/
– Slides are available
15. Workshop on Architecting Memory Technologies
• Moderator: Shih-Lien Lu, Intel Labs
• Professor Mattan Erez, University of Texas at Austin
• Professor Bruce Jacob, University of Maryland
• Professor Hsien-Hsin Lee, Georgia Tech
• Professor Onur Mutlu, Carnegie Mellon University
• Professor Yuan Xie, Pennsylvania State University
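– Website: http://web.engr.oregonstate.edu/~sllu/asplos2010 (slides available)
• Transition to non-volatile RAM, power consumption issues, performance degradation caused by multicore contention
• Optimal storage size per core
– Mattan Erez (Texas Austin)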
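FIT (Failures In Time) is a way of expressing failure rate. Its unit is
the number of failures per one billion hours of operation. For example,
three failures in one billion hours gives a failure rate of 3 FIT.
Typical electronic components have a FIT of about 10 to 100. Because the
failure rate of a whole system is the sum of its components' failure rates,
the failure rate rises as the number of components grows.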
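The arithmetic above can be written as a short C sketch. The function names `system_fit` and `mttf_hours` are chosen for this example. The MTTF conversion additionally assumes a constant failure rate, which the slide does not state.

```c
#include <stddef.h>

/* Sum component failure rates (in FIT) to get the system failure rate. */
static double system_fit(const double *component_fit, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += component_fit[i];
    return total;
}

/* Mean time to failure in hours, assuming a constant failure rate:
 * one billion hours divided by the FIT value. */
static double mttf_hours(double fit)
{
    return 1e9 / fit;
}
```

For instance, three components rated at 10, 20 and 70 FIT combine to 100 FIT, which under the constant-rate assumption corresponds to an MTTF of ten million hours.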
16. VEE Day 1
• Keynote Talk “Transistors to Toys: Teaching Systems to
Freshmen”
– Peter M. Chen (University of Michigan)
• Debugging and Replay
– Capability Wrangling Made Easy: Debugging on a Microkernel with
Valgrind
• Aaron Pohle (Technische Universität Dresden), Björn Döbel, Michael
Roitzsch, Hermann Härtig
– Multi-Stage Replay with Crosscut
• Jim Chow, Dominic Lucchetti, Tal Garfinkel, Geoffrey Lefebvre, Ryan Gardner, Joshua
Mason, Sam Small, Peter M. Chen (University of Michigan)
– Optimizing Crash Dump in Virtualized Environments
• Yijian Huang (Fudan University), Haibo Chen, Binyu Zang
17. VEE Day 2
• Keynote Talk, “Looking Beyond a Singularity”
– Galen C. Hunt (Microsoft Research)
• Compiler Infrastructure
– Improving Compiler-Runtime Separation with XIR
• Ben L. Titzer (Google), Thomas Würthinger, Doug Simon, Marcelo Cintra
– VMKit: A Substrate for Managed Runtime Environments
• Nicolas Geoffray (Université Pierre et Marie Curie), Gaël Thomas, Julia Lawall, Gilles Muller, Bertil Folliot
• Featured Talk “Spice up your browser: NaCl, Pepper, and beyond”
– Robert Muth (Google)
• Applications of Virtualization
– Neon: System Support for Derived Data Management
• Qing Zhang (University of California, San Diego), John McCullough, Justin Ma, Nabil Schear, Michael Vrable (University of
California, San Diego), Amin Vahdat, Alex C. Snoeren, Geoffrey M. Voelker, Stefan Savage
– Energy-Efficient Storage in Virtual Machine Environments
• Lei Ye (University of Arizona), Gen Lu, Sushanth Kumar, Chris Gniady, John H. Hartman
• Hypervisor Scheduling
– AASH: An Asymmetry-Aware Scheduler for Hypervisors
• Vahid Kazempour, Ali Kamali, Alexandra Fedorova (Simon Fraser University)
– Supporting Soft Real-Time Tasks in the Xen Hypervisor
• Min Lee (Georgia Institute of Technology), A. S. Krishnakumar (Avaya Laboratories), P. Krishnan,
Navjot Singh, Shalini Yajnik
18. VEE Day 3
• Java
– Efficient Runtime Tracking of Allocation Sites in Java
• Rei Odaira (IBM Research - Tokyo), Kazunori Ogata, Kiyokuni Kawachiya,
Tamiya Onodera (IBM Research - Tokyo), Toshio Nakatani
– Evaluation of a Just-In-Time Compiler Retrofitted for PHP
• Michiaki Tatsubori (IBM Research - Tokyo), Akihiko Tozawa, Toyotaro
Suzumura, Scott Trent, Tamiya Onodera
– Novel Online Profiling for Virtual Machines
• Manjiri A. Namjoshi (University of Kansas), Prasad A. Kulkarni
• Dynamic Binary Translation
– DBT Path Selection for Holistic Memory Efficiency and Performance
• Apala Guha (University of Virginia), Kim Hazelwood, Mary Lou Soffa
– Dynamic Binary Translation Specialized for Embedded Systems
• Goh Kondoh (IBM Research - Tokyo), Hideaki Komatsu
20. Capability Wrangling Made Easy: Debugging on a
Microkernel with Valgrind
Aaron Pohle (Technische Universität Dresden), Björn Döbel, Michael Roitzsch, Hermann Härtig
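• How to port Valgrind to Fiasco.OC, an L4-family microkernel
• Memory management differs, so a mechanism is needed to reconcile the two
– Under Valgrind, the memory space of the application (client) can be managed by Valgrind. The OS
interface is POSIX
– Fiasco.OC is capability-based
• CapCheck, built on Valgrind, makes it possible to check
capability delegation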
21. AASH: An Asymmetry-Aware Scheduler for Hypervisors
Vahid Kazempour, Ali Kamali, Alexandra Fedorova (Simon Fraser University)
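• Proposes a hypervisor scheduler for asymmetric multicore processors (same ISA, with
two core types: fast and slow)
– Basics:
• Fast cores are shared fairly
• The core configuration is made visible inside the guest
– Scheduling threads onto fast cores is the guest OS's job (guest-side awareness)
• Fast-core allocation supports priorities
– When a fast core is idle, it is assigned in preference to
a slow core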
– Using an MSR (Model-Specific Register) to notify the guest OS of
core changes is left as future work
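The fair sharing of fast cores listed above can be sketched in C. This is a deliberately simple least-served-first policy written for illustration, not the AASH algorithm, and it ignores the priority support. The `fast_share` struct and `pick_for_fast_core` are hypothetical names.

```c
#include <stddef.h>

#define MAX_VCPUS 16

typedef struct {
    unsigned fast_ticks[MAX_VCPUS]; /* fast-core time each vCPU has received */
    size_t n;                       /* number of runnable vCPUs (<= MAX_VCPUS) */
} fast_share;

/* Pick the vCPU that has had the least fast-core time so far
 * (ties go to the lowest index) and charge it one tick. */
static size_t pick_for_fast_core(fast_share *s)
{
    size_t best = 0;
    for (size_t i = 1; i < s->n; i++)
        if (s->fast_ticks[i] < s->fast_ticks[best])
            best = i;
    s->fast_ticks[best]++;
    return best;
}
```

With three vCPUs competing for one fast core, repeated calls rotate through them, so after nine ticks each vCPU has received three ticks of fast-core time.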
22. AASH: An Asymmetry-Aware Scheduler for Hypervisors
• Implementation
– Modified the Credit Scheduler of Xen 3.0
– Two 4-core AMD Opterons (8 cores total)
• 1 fast core at 2 GHz, 7 slow cores at 1 GHz
• Configured via DVFS (Dynamic Voltage and Frequency Scaling)?
• Evaluation
– Results were 36% better than with Xen's original
scheduler.