Smith Waterman Algorithm - Performance Analysis

Armin Bundle
Department of Computer Science, University of Erlangen
Seminar mucosim SS 2016

Outline
1. The Smith Waterman Algorithm: the concept
2. Profiling and data structure
3. The code
4. Likwid performance measurement
5. Problems and Outlook

The Smith Waterman Algorithm: The concept (1)
The Smith Waterman algorithm performs local sequence alignment to find similar regions in, for example, DNA or protein sequences. A sequence alignment is a sequence of edit operations.
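To make "an alignment is a sequence of edit operations" concrete, here is a minimal sketch. The operation codes (`M` match, `R` replace, `D` delete, `I` insert) and the function name `apply_edit_ops` are assumptions chosen for illustration; they do not come from the slides.

```python
def apply_edit_ops(a, b, ops):
    """Render the alignment of a and b described by a string of edit operations.

    Hypothetical operation codes (not from the slides):
      M = match, R = replace (mismatch), D = delete (gap in b), I = insert (gap in a)
    """
    top, bottom = [], []
    i = j = 0
    for op in ops:
        if op in ("M", "R"):
            # Consume one symbol from each sequence.
            top.append(a[i])
            bottom.append(b[j])
            i += 1
            j += 1
        elif op == "D":
            # Symbol of a is aligned against a gap.
            top.append(a[i])
            bottom.append("-")
            i += 1
        elif op == "I":
            # Symbol of b is aligned against a gap.
            top.append("-")
            bottom.append(b[j])
            j += 1
        else:
            raise ValueError(f"unknown edit operation: {op!r}")
    return "".join(top), "".join(bottom)


aligned_a, aligned_b = apply_edit_ops("ACGT", "ACT", "MMDM")
print(aligned_a)
print(aligned_b)
```

Each operation consumes a symbol from one or both sequences, so the operation string fully determines the two gapped rows of the alignment.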

The Smith Waterman Algorithm: The concept (2)
- It is a variation of the Needleman-Wunsch algorithm, which compares two sequences and produces a global similarity score; Smith Waterman scores local alignments instead.
- Application area: the search for genes whose sequences are similar to well-known genes.
- The algorithm uses dynamic programming.
- The complexity is quadratic: O(m·n) for sequences of lengths m and n.

The Smith Waterman Algorithm
First step: the matrix initialisation

The Smith Waterman Algorithm
Example input data: f = -1 (the score added for a gap), MatchScore m = 2, MismatchScore mm = -1

Calculation function:
\[ w(x, y) = \begin{cases} m, & x = y \\ \mathit{mm}, & \text{else} \end{cases} \]

Evaluate the neighbours:
\[ F(i, j) = \max \begin{cases} 0 \\ F(i-1, j-1) + w(x_i, y_j) \\ F(i-1, j) + f \\ F(i, j-1) + f \end{cases} \]
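The recurrence can be sketched in a few lines of Python. This is an illustration under assumptions, not the code analysed in the talk: the names `w`, `fill_matrix`, `MATCH`, `MISMATCH` and `GAP` are hypothetical, and the input strings are made up. The scores are the example values from the slide.

```python
# Illustrative sketch; all names are hypothetical, not taken from the analysed code.
MATCH = 2      # MatchScore m from the slide
MISMATCH = -1  # MismatchScore mm from the slide
GAP = -1       # f from the slide, added when a gap is opened or extended


def w(x, y):
    """Substitution score: m if the symbols are equal, mm otherwise."""
    return MATCH if x == y else MISMATCH


def fill_matrix(a, b):
    """Return the (len(a)+1) x (len(b)+1) score matrix F."""
    rows, cols = len(a) + 1, len(b) + 1
    # Initialisation: every cell starts at 0, so row 0 and column 0 stay 0.
    F = [[0] * cols for _ in range(rows)]
    # Evaluate the three neighbours of every inner cell; 0 is the lower bound.
    for i in range(1, rows):
        for j in range(1, cols):
            F[i][j] = max(
                0,
                F[i - 1][j - 1] + w(a[i - 1], b[j - 1]),
                F[i - 1][j] + GAP,
                F[i][j - 1] + GAP,
            )
    return F


F = fill_matrix("ACGT", "ACT")
for row in F:
    print(row)
print("best local score:", max(max(row) for row in F))
```

The two nested loops touch every inner cell exactly once, which is where the quadratic complexity comes from.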

The Smith Waterman Algorithm
Second step: calculation of the local alignment score of the matrix

The Smith Waterman Algorithm
Third step: Traceback Matrix
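The traceback step can be sketched as follows. This is a self-contained illustration, not the implementation from the talk: the function name `smith_waterman`, its parameters and the tie-breaking order (diagonal first, then up, then left) are assumptions. The traceback starts at the highest-scoring cell and follows the neighbour that produced each value until it reaches a cell with score 0.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best score, aligned part of a, aligned part of b).

    Hypothetical helper for illustration; default scores are the slide's example values.
    """
    rows, cols = len(a) + 1, len(b) + 1
    F = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix and remember where the maximum is.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(0, F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)
            if F[i][j] > best:
                best, best_pos = F[i][j], (i, j)

    # Traceback: walk back from the maximum until a cell with score 0 is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and F[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if F[i][j] == F[i - 1][j - 1] + s:    # came from the diagonal neighbour
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif F[i][j] == F[i - 1][j] + gap:    # came from above: gap in b
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:                                 # came from the left: gap in a
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1

    # The symbols were collected back to front.
    return best, "".join(reversed(top)), "".join(reversed(bottom))


score, row_a, row_b = smith_waterman("ACGT", "ACT")
print(score)
print(row_a)
print(row_b)
```

When several neighbours could have produced the same value, a different tie-breaking order yields a different alignment with the same score.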

Profiling and data structure
Profiling

Profiling and data structure: Data structure
Input value: scale (reference value is 41)
Arrays:
- sequence size = 1 << (scale / 2)
- main sequence & match sequence: 1 MB of memory (for scale = 41)
- goodscores & scores: 4.8 kB
- goodendsi, goodendsj, index & best: 2.4 kB
- weights: 0.6 kB
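The sequence size for the reference input can be checked with a short calculation. The sketch assumes one byte per sequence element, which is consistent with the 1 MB quoted on the slide but is not stated there. The function name `sequence_size` is hypothetical.

```python
def sequence_size(scale):
    """Number of elements per sequence: 1 << (scale / 2) with C-style integer division."""
    # In C, scale / 2 truncates for ints; // gives the same result for non-negative scale.
    return 1 << (scale // 2)


n = sequence_size(41)
bytes_per_element = 1  # assumption: one byte per sequence element
print(n)                                  # number of elements for scale = 41
print(n * bytes_per_element / 2**20)      # size in MiB under the assumption above
```

Because of the integer division, scale = 40 and scale = 41 give the same sequence size.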

The code (1)

The code (2)

The code (3)

Likwid performance measurement

Metric                      Value
Branch misprediction rate   7.8e-6
Load to Store ratio         5.5
CPI                         0.42
L2 bandwidth [MBytes/s]     5702
L2 data volume [GBytes]     606.2
L2 miss rate                0.0084
L3 bandwidth [MBytes/s]     5180
L3 data volume [GBytes]     550.0
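A few derived numbers help to read these metrics. The sketch below assumes that the data volumes are totals in GBytes (decimal, 1 GByte = 1000 MBytes) and that each bandwidth is the corresponding volume divided by the measured runtime. Under those assumptions, volume divided by bandwidth gives a rough estimate of the measured time. The variable names are hypothetical.

```python
# Values copied from the measurement table; variable names are hypothetical.
cpi = 0.42
l2_bandwidth_mb_s = 5702.0
l2_volume_gb = 606.2
l3_bandwidth_mb_s = 5180.0
l3_volume_gb = 550.0

# Instructions per cycle is the reciprocal of cycles per instruction.
ipc = 1.0 / cpi

# Assumption: bandwidth = volume / runtime, so runtime = volume / bandwidth.
runtime_from_l2 = l2_volume_gb * 1000.0 / l2_bandwidth_mb_s
runtime_from_l3 = l3_volume_gb * 1000.0 / l3_bandwidth_mb_s

print(f"IPC: {ipc:.2f}")
print(f"runtime estimate from L2 counters: {runtime_from_l2:.1f} s")
print(f"runtime estimate from L3 counters: {runtime_from_l3:.1f} s")
```

Both estimates come out at roughly 106 s, so the L2 and L3 figures are consistent with each other under the stated assumptions.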

Likwid performance measurement
Runtime

Problems & Outlook
Problems:
- Running the code with MPI
- Getting a node for memory measurements
- The roofline model
Outlook:
- Change the data structure or the order of the sequence array access
- Use MPI to see how the performance increases
- Use the SIMD technology of CPUs
- Convert the code for GPUs
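One possible reading of "change the order of the array access" is an anti-diagonal (wavefront) traversal, a common reordering for this kind of recurrence. The slides do not say which reordering the author had in mind, so the sketch below is an assumption. All cells with the same i + j depend only on the two previous anti-diagonals, so they are independent of each other and could in principle be computed with SIMD lanes or parallel threads. The function names are hypothetical.

```python
def fill_row_major(a, b, match=2, mismatch=-1, gap=-1):
    """Reference fill order: row by row, left to right."""
    rows, cols = len(a) + 1, len(b) + 1
    F = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(0, F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)
    return F


def fill_by_antidiagonals(a, b, match=2, mismatch=-1, gap=-1):
    """Same recurrence, but cells are visited one anti-diagonal (i + j == d) at a time."""
    rows, cols = len(a) + 1, len(b) + 1
    F = [[0] * cols for _ in range(rows)]
    for d in range(2, rows + cols - 1):
        # Valid row indices on this anti-diagonal: 1 <= i <= rows-1 and 1 <= d-i <= cols-1.
        i_lo = max(1, d - (cols - 1))
        i_hi = min(rows - 1, d - 1)
        # The iterations of this inner loop do not depend on each other.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(0, F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)
    return F


same = fill_by_antidiagonals("GATTACA", "GCATGCT") == fill_row_major("GATTACA", "GCATGCT")
print(same)
```

Both orders produce the same matrix; only the traversal changes, which is what makes the inner loop a candidate for vectorisation.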

Appendix: Sources
https://pressbit.wordpress.com/2014/03/07/lokalessequenzalignment-mit-dem-smith-waterman-algorithmus-in-c (March 7, 2014)