Recursive Mergesort. CSE 589 Applied Algorithms Spring Merging Pattern of Recursive Mergesort. Mergesort Call Tree. Reorder the Merging Steps


CSE 589 Applied Algorithms, Spring 1999
  Cache Performance
  Mergesort
  Heapsort

CSE 589 - Lecture 8 - Spring 1999

Recursive Mergesort
  A[1..n] is to be sorted; B[1..n] is an auxiliary array.

  Mergesort(i, j)   {sorts the subarray A[i..j]}
    if i < j then
      k := (i+j)/2;
      Mergesort(i, k);
      Mergesort(k+1, j);
      Merge A[i..k] with A[k+1..j] into B[i..j];
      Copy B[i..j] into A[i..j];

Mergesort Call Tree
  [Figure: call tree of recursive mergesort]

Merging Pattern of Recursive Mergesort
  [Figure: merging pattern over the array, with a marker at 1/2 cache size]

Notes on Recursive Mergesort
  Oblivious recursion
    The subarrays that are merged do not depend on the particular keys, just on the ...
  Lots of copying from the auxiliary array to the source arrays
  Recursion is elegant, but is it really needed?
  Sorting very small arrays should be done in-place

Reorder the Merging Steps
  [Figure: the same merges, reordered]
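The recursive pseudocode on the slide can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the lecture's code: indices are 0-based rather than 1-based, and the names merge, mergesort, and sort_recursive are hypothetical.

```python
def merge(src, dst, lo, mid, hi):
    """Merge sorted src[lo..mid] and src[mid+1..hi] (inclusive) into dst[lo..hi]."""
    i, j = lo, mid + 1
    for k in range(lo, hi + 1):
        # Take from the left run while it has keys and its head is not larger.
        if j > hi or (i <= mid and src[i] <= src[j]):
            dst[k] = src[i]
            i += 1
        else:
            dst[k] = src[j]
            j += 1


def mergesort(a, b, i, j):
    """Sort the subarray a[i..j] (inclusive), using b as the auxiliary array."""
    if i < j:
        k = (i + j) // 2
        mergesort(a, b, i, k)
        mergesort(a, b, k + 1, j)
        merge(a, b, i, k, j)
        a[i:j + 1] = b[i:j + 1]  # the copy back that the slides count as overhead


def sort_recursive(keys):
    """Convenience wrapper (hypothetical): returns a sorted copy of keys."""
    a = list(keys)
    b = [None] * len(a)
    mergesort(a, b, 0, len(a) - 1)
    return a
```

Every merge is followed by a copy from B back to A, which is the copying the notes slide points at.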

Iterative Mergesort
  Sort small groups in-place
  Alternate the roles of A and B as the source of the merging passes
  Copy B to A if needed at the end

    in-place sort groups of 4;
    merge sorted groups of 4 in A into sorted groups of 8 in B;
    merge sorted groups of 8 in B into sorted groups of 16 in A;
    merge sorted groups of 16 in A into sorted groups of 32 in B;
    in the end if the sorted array is B then copy it to A;

Iterative Mergesort Access Pattern
  [Figure: access pattern of the in-place sort, the merge passes, and the final copy]

Analysis of Access Pattern
  One pass to sort into groups of 4
    The pass touches n key locations
  log2(n/4) merge passes
    Each pass touches 2n key locations, n in the source array and n in the destination array
  One copy pass if log2(n/4) is odd
    The pass touches 2n key locations

Performance of Iterative Mergesort
  [Plot: cycles per key for the iterative sort. Alpha, 2 MB L2 cache, 32-byte cache line, 4 keys/cache line]

Cache Performance Matters
  Processor speeds are increasing faster than memory speeds
  Cache miss penalties can be many cycles and are growing
  Algorithm design can be used to reduce cache misses and improve overall performance

Cache Model
  [Figure: processor, cache, and memory; data moves between cache and memory in cache blocks (lines)]
  Direct mapped cache
  Cache line
  Cache hit
  Cache miss
  Cache parameters
    Cache capacity
    Cache line size
    Set associativity
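A minimal Python sketch of the iterative scheme described above: sort groups of 4 in place, then run merge passes that alternate A and B as source and destination, copying back only if the result ends up in the auxiliary array. Insertion sort for the small groups is an assumption (the slides only say "in-place sort"), and the function names are hypothetical.

```python
def insertion_sort(a, lo, hi):
    """Sort a[lo:hi] in place (used for the small initial groups)."""
    for i in range(lo + 1, hi):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key


def merge_runs(src, dst, lo, mid, hi):
    """Merge sorted src[lo:mid] and src[mid:hi] into dst[lo:hi]."""
    i, j = lo, mid
    for k in range(lo, hi):
        if j >= hi or (i < mid and src[i] <= src[j]):
            dst[k] = src[i]
            i += 1
        else:
            dst[k] = src[j]
            j += 1


def iterative_mergesort(a, group=4):
    """Sort list a in place with bottom-up merge passes; returns a."""
    n = len(a)
    for lo in range(0, n, group):
        insertion_sort(a, lo, min(lo + group, n))
    src, dst = a, [None] * n  # dst plays the role of the auxiliary array B
    width = group
    while width < n:
        for lo in range(0, n, 2 * width):
            merge_runs(src, dst, lo, min(lo + width, n), min(lo + 2 * width, n))
        src, dst = dst, src   # alternate the roles of A and B
        width *= 2
    if src is not a:          # odd number of passes: result is in B
        a[:] = src
    return a
```

With n = 32 and groups of 4 there are three merge passes (4 to 8, 8 to 16, 16 to 32), so the sorted data ends in the auxiliary array and the final copy runs, matching the "copy if log2(n/4) is odd" case.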

Cache Miss Terminology
  Types of misses
    Compulsory miss: first time a memory block is read
    Capacity miss: accessed data does not fit in cache
    Conflict miss: several active memory blocks map to the same place in the cache
  Locality reduces cache misses
    Temporal locality: a location that was recently accessed is accessed again
    Spatial locality: data on the same block are accessed together

Cache Conscious Mergesort Execution Performance
  [Plot: cycles per key, iterative sort vs. cache conscious sort. Alpha, 2 MB L2 cache, 32-byte cache line, 4 keys/cache line]

Cache Conscious Mergesort
  Partition problem into tiles that fit in the cache
  Mergesort the tiles
  Merge the tiles
  Avoid copying by sorting in-place into groups of 2 or 4, depending on whether log2(n/4) is odd or even

Cache Conscious Mergesort
  [Figure: tiles sorted in-place, then merged; marker at 1/2 cache size]

Traversal Analysis
  [Figure: traversal of an array, showing the part not in cache and the part in cache]

Traversal Longer than Cache
  [Figure: traversal spanning more than the cache size]
  1/B misses per access, where B is the number of accesses per line
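The tiling idea (partition into tiles that fit in the cache, mergesort each tile, then merge the tiles) can be sketched in Python as below. This only illustrates the order of memory accesses; Python lists will not show the cache effect the slides measure. The tile parameter stands in for the cache-derived tile size, the function names are hypothetical, and the sketch copies back after an odd number of passes instead of using the slides' trick of choosing initial groups of 2 or 4.

```python
def merge_runs(src, dst, lo, mid, hi):
    """Merge sorted src[lo:mid] and src[mid:hi] into dst[lo:hi]."""
    i, j = lo, mid
    for k in range(lo, hi):
        if j >= hi or (i < mid and src[i] <= src[j]):
            dst[k] = src[i]
            i += 1
        else:
            dst[k] = src[j]
            j += 1


def merge_passes(a, buf, lo, hi, width):
    """Bottom-up merge of a[lo:hi], starting from sorted runs of `width` keys.

    The sorted result always ends up in a[lo:hi].
    """
    src, dst = a, buf
    while width < hi - lo:
        for s in range(lo, hi, 2 * width):
            merge_runs(src, dst, s, min(s + width, hi), min(s + 2 * width, hi))
        src, dst = dst, src
        width *= 2
    if src is not a:
        a[lo:hi] = src[lo:hi]


def tiled_mergesort(a, tile):
    """Sort list a in place: first within tiles of `tile` keys, then across tiles."""
    n = len(a)
    buf = [None] * n
    # Phase 1: each tile is sorted on its own, so its working set stays small.
    for lo in range(0, n, tile):
        merge_passes(a, buf, lo, min(lo + tile, n), 1)
    # Phase 2: merge the sorted tiles with full passes over the array.
    merge_passes(a, buf, 0, n, tile)
    return a
```

Only phase 2 streams the whole array through the cache; all the early passes stay inside one tile at a time.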

Analysis of Cache Misses
  Parameters
    B keys per cache line
    C cache lines in the cache
    n keys, with n >> BC

Iterative Mergesort Cache Misses
  Cache misses per key:
    $\frac{1}{B} + \frac{2}{B}\log_2\frac{n}{4} + \frac{2}{B}\left(\log_2\frac{n}{4} \bmod 2\right)$
  The terms are, in order: the in-place sort, the merge passes, and the final copy.

Cache Conscious Merge Sort Analysis
  Cache misses per key:
    $\frac{2}{B} + \frac{2}{B}\log_2\frac{2n}{BC}$
  The terms are, in order: sorting each tile, and the final merge passes.
  Tile size is BC/2
  n/(BC/2) tiles to be merged in the end
  This takes log2(n/(BC/2)) passes

Cache Conscious Misses
  [Figure: tiles sorted in-place, then merged]

Simulated Cache Performance
  [Plot: cache misses per key, iterative sort vs. cache conscious sort. Atom cache simulation, 2 MB L2 cache, 32-byte cache line, 4 keys/cache line]

Instruction Counts
  [Plot: instructions per key, iterative sort vs. cache conscious sort. Atom simulation]
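To get a feel for the two miss counts, here is a small Python sketch that evaluates them as 1/B + (2/B) log2(n/4) + (2/B)(log2(n/4) mod 2) for the iterative sort and 2/B + (2/B) log2(n/(BC/2)) for the cache conscious sort. Rounding the pass count up with ceil (for n that is not a power of two) is an assumption added here, and the function names are hypothetical.

```python
import math


def iterative_misses_per_key(n, B):
    """Estimated cache misses per key for iterative mergesort.

    n: number of keys, B: keys per cache line.
    """
    passes = math.ceil(math.log2(n / 4))  # merge passes after groups of 4
    in_place_sort = 1 / B                 # one pass touching n keys
    merging = (2 / B) * passes            # each pass touches 2n keys
    copy = (2 / B) * (passes % 2)         # extra copy pass if passes is odd
    return in_place_sort + merging + copy


def tiled_misses_per_key(n, B, C):
    """Estimated cache misses per key for cache conscious mergesort.

    C: number of cache lines, so a tile holds B*C/2 keys.
    """
    tile = B * C / 2
    passes = math.ceil(math.log2(n / tile))  # passes that merge the tiles
    return 2 / B + (2 / B) * passes
```

For example, with B = 4, C = 65536 (a 2 MB cache of 32-byte lines) and n = 2**22 keys, the estimates are 10.25 misses per key for the iterative sort and 3.0 for the cache conscious sort.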

What About Recursive Mergesort?
  [Figure: merging pattern of recursive mergesort with cache hits and cache misses marked; marker at 1/2 cache size]

Notes on Cache Performance
  Before trying cache conscious algorithm design you should ask if performance is really a problem
    if not, then don't tinker
    if so, then check out the algorithm and data structures first
      Going from an n^2 algorithm to an n log n algorithm can make a world of difference
    if the algorithm and data structures are basically good, then consider a cache conscious design

Some Guiding Principles
  Sacrifice instructions for better cache performance
  Knowing architectural constants can lead to better algorithms
    Cache capacity, line size
  Small memory footprints are good
    Reduces capacity misses
  Block data into cache size pieces
    Reduces capacity misses
  Fully utilize cache lines
    Improves spatial locality

Heapsort
  Classic in-place, O(n log n) sorting algorithm
  Uses the binary heap, an elegant priority queue data structure (insert and delete-max)
    Perfectly balanced tree with the heap property
    Each node is larger than its children

Insert
  [Figure: heap with an open position for the new key]

Insert (2)
  [Figure: heap with the open position moving up]

Insert (3)
  [Figures: heap diagrams showing the remaining steps of the insertion]

Delete-Max, Delete-Max (2), Delete-Max (3), Delete-Max (4)
  [Figures: heap diagrams showing the open position moving down from the root]
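The insert and delete-max walkthroughs in these figures can be sketched in Python as follows. This is a minimal illustration, not the lecture's code: it assumes the heap is stored in a 0-based array where the parent of position i is (i-1)//2 and its children are 2i+1 and 2i+2, and the class and method names are hypothetical.

```python
class MaxHeap:
    """Binary max-heap stored in a Python list (0-based positions)."""

    def __init__(self):
        self.a = []

    def insert(self, key):
        """Append key at the bottom and percolate it up."""
        a = self.a
        a.append(key)
        i = len(a) - 1
        while i > 0 and a[(i - 1) // 2] < a[i]:
            parent = (i - 1) // 2
            a[parent], a[i] = a[i], a[parent]
            i = parent

    def delete_max(self):
        """Remove and return the largest key (raises IndexError if empty)."""
        a = self.a
        top = a[0]
        last = a.pop()
        if a:
            a[0] = last          # move the last key to the root
            self._percolate_down(0)
        return top

    def _percolate_down(self, i):
        a, n = self.a, len(self.a)
        while True:
            child = 2 * i + 1
            if child >= n:
                break
            if child + 1 < n and a[child + 1] > a[child]:
                child += 1       # pick the larger child
            if a[child] <= a[i]:
                break
            a[i], a[child] = a[child], a[i]
            i = child
```

Inserting the keys 7, 3, 8, 1, 9, 5, 2 and then calling delete_max repeatedly returns them in decreasing order.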

Delete-Max (5)
  [Figures: heap diagrams showing the final steps of the deletion]

Analysis of the Heap Operations
  Insert - O(log n) worst case
    Each percolate up goes up at most log n levels
    Often O(1) in practice because keys do not percolate far
  Delete-Max - O(log n) worst case
    Percolates down tend to go close to the leaves of the heap

Implicit Pointers
  [Figure: heap stored in an array, positions numbered from 0]
  parent of i is (i-1)/2
  children of i are 2i+1, 2i+2

Heapsort
  Williams 1964
  We will sort the array A[0..n-1] in-place
    Build a heap in-place
    For i = n-1 to 1
      A[i] := delete-max;

Invariants
  [Figure: the array split into a heap part followed by a sorted part; Heap < Sorted]
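A minimal Python sketch of in-place heapsort using implicit pointers in a 0-based array (parent of i at (i-1)//2, children at 2i+1 and 2i+2). The slides only say "build a heap in-place"; building it bottom-up with repeated sift-downs is an assumption made here, and the function names are hypothetical.

```python
def sift_down(a, i, n):
    """Percolate a[i] down within the heap a[0:n]."""
    while True:
        child = 2 * i + 1
        if child >= n:
            return
        if child + 1 < n and a[child + 1] > a[child]:
            child += 1           # pick the larger child
        if a[child] <= a[i]:
            return
        a[i], a[child] = a[child], a[i]
        i = child


def heapsort(a):
    """Sort list a in place in increasing order; returns a."""
    n = len(a)
    # Build a max-heap in place (bottom-up construction, an assumption).
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    # Invariant: a[0:i+1] is a heap, a[i+1:n] is sorted and holds the largest keys.
    for i in range(n - 1, 0, -1):
        a[0], a[i] = a[i], a[0]  # A[i] := delete-max
        sift_down(a, 0, i)
    return a
```

The swap moves the current maximum into position i, which is exactly the slot freed when the heap shrinks by one, so no auxiliary array is needed.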