CSCI 476: Quiz #1

- Map/Reduce + Spark
  - Definition of Map and Reduce
  - RDD (resilient distributed dataset)
  - Transformations vs. Actions
  - Programming Terms:
    - Lazy evaluation
    - Lambda function
  - Example Transformations:
    - map, filter, union, join
  - Example Actions:
    - reduce, collect, count
  - Performance optimizations:
    - cache() -- what does it do?
    - partitions
  - PageRank
    - What does it do?
    - Trace through an example (data given, fill-in transformations)
- Parallel Architectures
  - Flynn's Taxonomy
    - SISD
    - SIMD
    - MISD
    - MIMD
  - Modern Classification
    - Data Parallel
    - Function Parallel
    - Instruction vs. Thread vs. Process (there's a tree)
  - Benchmarking
    - MIPS vs. FLOPS
    - Peak vs. Sustained Performance
- OpenMP
  - Shared Memory Model
    - Shared vs. Private
  - fork/join programming model
  - define: pragma
  - constructs:
    - parallel
    - for
    - schedule({static,guided,dynamic} [, chunk])
    - reduction(operator : list)
    - private vs. firstprivate vs. lastprivate
    - critical vs. master vs. single vs. atomic
- C++ Threads
  - creation of a thread
  - std::cref and std::ref
  - mutual exclusion / mutex
  - lambda capture
  - getting return values with std::future and std::async
  - concept of joinable
    - join() or detach()
  - partitioning algorithm / formula
- Amdahl's Law
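
For the Amdahl's Law topic, a worked form of the formula may help when reviewing. The notation is an assumption, since the outline does not fix one: p is the fraction of the runtime that can be parallelized, N is the number of processors, and S(N) is the resulting speedup. The numbers in the example (p = 0.9, N = 8) are illustrative only.

```latex
% Amdahl's Law: speedup with N processors when a fraction p is parallelizable
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}}

% Upper bound as the processor count grows without limit
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}

% Illustrative example: p = 0.9, N = 8
S(8) = \frac{1}{0.1 + \dfrac{0.9}{8}} = \frac{1}{0.2125} \approx 4.71,
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{0.1} = 10
```

The serial fraction (1 - p) sets the ceiling: in this example no processor count can push the speedup past 10.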
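
The outline does not spell out the "partitioning algorithm / formula" topic, so the following is only a sketch of one common choice: block partitioning, where n items are split into contiguous ranges and any remainder is spread over the first few workers. The names `partition_bounds` and `all_partitions` are hypothetical. Python is used here for brevity; the same arithmetic carries over to code that hands each range to a C++ thread.

```python
def partition_bounds(n, num_workers, worker_id):
    """Return the half-open range [start, end) of items owned by worker_id."""
    base, extra = divmod(n, num_workers)
    # The first `extra` workers each take one additional item,
    # so partition sizes differ by at most one.
    start = worker_id * base + min(worker_id, extra)
    end = start + base + (1 if worker_id < extra else 0)
    return start, end


def all_partitions(n, num_workers):
    """Return the list of [start, end) ranges for every worker."""
    return [partition_bounds(n, num_workers, w) for w in range(num_workers)]


if __name__ == "__main__":
    # Illustrative sizes: 10 items split across 4 workers.
    for worker, (start, end) in enumerate(all_partitions(10, 4)):
        print(f"worker {worker}: items [{start}, {end})")
```

Because each range is computed from the worker's index alone, every thread can work out its own bounds without coordinating with the others.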
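
The map, reduce, lambda function, and lazy evaluation topics can be reviewed without a Spark installation using a plain-Python analogy. This is not Spark's API. It assumes Python 3, where the built-in `map` and `filter` return lazy iterators. These stand in for transformations, and `functools.reduce` plays the role of an action that forces the computation. The `calls` list and the `square` helper are hypothetical and exist only to show when the work actually happens.

```python
from functools import reduce

calls = []  # records each time the mapped function actually runs


def square(x):
    calls.append(x)
    return x * x


numbers = [1, 2, 3, 4, 5]

# "Transformations": these only describe the pipeline; nothing runs yet.
squared = map(square, numbers)
evens = filter(lambda x: x % 2 == 0, squared)
calls_before_action = len(calls)  # no element has been squared so far

# "Action": reduce pulls values through the pipeline and combines them.
total = reduce(lambda a, b: a + b, evens)

print(calls_before_action, len(calls), total)
```

The squaring function is not called until `reduce` consumes the pipeline. Spark's transformations and actions relate in the same way, which is why the two lists are kept separate in the outline.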