Goals:
————–
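– Build a shared grounding in fundamental theory and develop the habit of reading papers
– Deepen our understanding of our own field
– Integrate what we learn, connecting theory with real business practice
– Distill theory from our systems and gradually build the ability to write papers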
Plan:
—————-
– Overview: What is a distributed system? A case study with MapReduce.
Reading:
. MapReduce: https://static.googleusercontent.com/media/research.google.com/zh-CN//archive/mapreduce-osdi04.pdf
– RPC and Threads
Reading:
. RPC: http://www.cs.wustl.edu/~schmidt/PDF/rpc4.pdf
. SEDA: https://www.cs.cornell.edu/courses/cs614/2003sp/papers/Wel01.pdf
– Local Storage
Reading:
. LSM: http://www.cs.umb.edu/~poneil/lsmtree.pdf
. (TODO: find a local storage paper that covers recent hardware such as NVMe)
– Distributed Storage
Reading:
. GFS: http://www.cs.cornell.edu/courses/cs614/2004sp/papers/gfs.pdf
. Dynamo: http://cloudgroup.neu.edu.cn/papers/cloud%20data%20storage/dynamo-sosp-2007.pdf
– Distributed Consensus
Reading:
. Chubby: https://research.google.com/archive/chubby.html
. Raft: https://raft.github.io/raft.pdf
– Distributed Transaction
Reading:
. PWV: http://www.vldb.org/pvldb/vol10/p613-faleiro.pdf
. Percolator: http://cloudgroup.neu.edu.cn/papers/model/incrprocess_osdi_2010.pdf
– Graph
Reading:
. GraphX: https://amplab.cs.berkeley.edu/wp-content/uploads/2014/09/graphx.pdf
. Wukong: https://www.usenix.org/conference/osdi16/technical-sessions/presentation/shi
– Streaming
Reading:
. Spark Streaming: https://people.csail.mit.edu/matei/papers/2013/sosp_spark_streaming.pdf
. TimeStream: https://www.microsoft.com/en-us/research/publication/timestream-reliable-stream-computation-in-the-cloud/
– Cluster Management
Reading:
. Mesos: https://cs.stanford.edu/~matei/papers/2011/nsdi_mesos.pdf
. Borg: https://pdos.csail.mit.edu/6.824/papers/borg.pdf
– System Reliability
Reading:
. Designing and Deploying Internet-Scale Services: http://mvdirona.com/jrh/talksAndPapers/JamesRH_Lisa.pdf
. ErrLog: https://www.usenix.org/system/files/conference/osdi12/osdi12-final-109.pdf
. Failure Recovery Be Evil: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/FailureRecoveryBeEvil.pdf