Grids: Why, How, and What Next

J. Templon, NIKHEF
ESA Grid Meeting, Noordwijk, 25 October 2002

Information I intend to transfer
- Why are Grids interesting? Grids are solutions, so I will spend some time talking about the problem and show how Grids are relevant. Solutions should solve a problem.
- How are we (high-energy physicists) using Grids? What tools are available?
- What's next? In particular:
  - Short-term (next 12 months) plans of the European DataGrid project
  - Longer-term needs of the HEP community
  - Emerging trends in Grid computing: what should we watch closely for the next couple of years?
Jeff Templon ESA Grid Meeting, Noordwijk, 2002.10.25-2

High-Energy Physics
[Figure: Fermilab (USA), with labels for the new accelerator (Main Injector), the central lab facility, the CDF and DØ experiments, the proton and antiproton beams, and a 1-mile scale.]

Why Collide Protons & Antiprotons?
- Look for particles: the most interesting phenomena come with carrier particles
  - Photoelectric effect (and solar cells): photons
  - Nuclear fusion: pions and other mesons
  - Radioactive decay: W and Z particles
- These particles are active within nuclei (like protons or antiprotons), but we want to take them out and study them. Sometimes we see the phenomenon but we don't know how it works; finding the carrier particle helps a lot!
- Analogy: suppose cars occurred in nature, but were made so that you couldn't take them apart (e.g. with screwdrivers and wrenches) and you couldn't look inside

How to study sealed cars
- Collide them at high speed into a wall!
  - Look at the fragments
  - In some collisions, the motor will fly out: cars have motors!
- Can't take the motor apart: need higher speeds!
- Some brilliant soul realizes that high-speed, head-on collisions of two cars result in even more fragments
- In high-energy physics, we're colliding our cars (protons) in order to find out how the spark plugs work
- At the LHC (CERN) we want to discover the particle responsible for how things in the universe have mass

CERN: The European Organisation for Nuclear Research
- 20 European countries
- 2,700 staff
- 6,000 users

Detecting the Fragments
[Figure: Fermilab (USA), showing the proton and antiproton beams, the DØ experiment, and a 1-mile scale.]

Detecting the Fragments (2)
[Figure: the DØ detector at Fermilab.]

What do collisions look like?
- Place event info on 3D map
- Trace trajectories through hits
- [still needs work!]
- Assign type to each track
- Find particles you want
- Needle in a haystack!
- This is a relatively easy case

More complex example

Computational Implications
- To reconstruct and analyze 1 event takes about 90 seconds
- Most collisions don't result in observable "spark plug" fragments: could be as few as one out of a million. But we have to check them all!
- The computer program needs lots of "calibration", determined from inspecting the results of a first pass:
  - Refine map of detector elements
  - Relation between detector signal strength and particle energy deposition
  - Calibrate detector clocks (how many ticks per microsecond?)
- Each event will be analyzed several times!

Data Handling and Computation for Physics Analysis
[Diagram with the following elements: detector; event filter (selection and reconstruction); raw data; event reprocessing; event simulation; event summary data; processed data; batch physics analysis; analysis objects (extracted by physics topic); interactive physics analysis.]

One of the four LHC detectors
Online system: multi-level trigger (filter out background, reduce data volume)
- 40 MHz (40 TB/sec) into level 1: special hardware
- 75 kHz (75 GB/sec) into level 2: embedded processors
- 5 kHz (5 GB/sec) into level 3: PCs
- 100 Hz (100 MB/sec) to data recording and offline analysis
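As a rough check on the trigger chain, the quoted rates can be turned into per-level reduction factors. This is only a sketch: the assignment of each rate to a trigger level follows one reading of the flattened diagram, the 1 MB event size is merely implied by 40 TB/sec at 40 MHz, and all names are illustrative.

```python
# Event rates (events per second) entering each stage, as quoted on the slide.
# Assumption: the stage assignment follows one reading of the flattened diagram.
STAGES = [
    ("level 1 input (collisions)", 40_000_000),
    ("level 2 input", 75_000),
    ("level 3 input", 5_000),
    ("recorded for offline analysis", 100),
]

# Implied by the slide: 40 TB/sec at 40 MHz is about 1 MB per event.
EVENT_SIZE_BYTES = 1_000_000


def reduction_factors(stages):
    """Return (stage name, factor by which the previous stage cut the rate)."""
    return [
        (name, prev_rate / rate)
        for (_, prev_rate), (name, rate) in zip(stages, stages[1:])
    ]


def overall_reduction(stages):
    """Ratio of the first rate to the last one."""
    return stages[0][1] / stages[-1][1]


if __name__ == "__main__":
    for name, factor in reduction_factors(STAGES):
        print(f"{name}: rate cut by a factor of {factor:.0f}")
    print(f"overall reduction: {overall_reduction(STAGES):.0f}")
    recorded = STAGES[-1][1] * EVENT_SIZE_BYTES / 1e6
    print(f"recorded volume: {recorded:.0f} MB/sec")
```

Under these assumptions the trigger keeps one event in 400,000, which is why the offline system only has to cope with 100 events per second.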

Computational Implications (2)
- 90 seconds per event to reconstruct and analyze
- 100 incoming events per second
- To keep up, need either:
  - a computer that is nine thousand times faster, or
  - nine thousand computers working together
- Moore's Law: wait 21 years and computers will be 9000 times faster (we need them in 2006!)
- Grids: make large numbers of computers work together
- Four LHC experiments plus extra work: need >50k computers
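The arithmetic behind these numbers can be sketched in a few lines. The 90 seconds and 100 events per second come from the slide; the doubling period of 1.6 years is an assumption chosen because it reproduces the quoted 21 years (a period of 18 months would give a little under 20 years).

```python
import math

SECONDS_PER_EVENT = 90        # from the slide
EVENTS_PER_SECOND = 100       # from the slide
DOUBLING_PERIOD_YEARS = 1.6   # assumption: not stated on the slide


def computers_needed(seconds_per_event, events_per_second):
    """Number of today's computers needed to keep up with the event rate."""
    return seconds_per_event * events_per_second


def years_until_speedup(speedup, doubling_period_years):
    """Years to wait for a given speedup if speed doubles every period."""
    return math.log2(speedup) * doubling_period_years


if __name__ == "__main__":
    n = computers_needed(SECONDS_PER_EVENT, EVENTS_PER_SECOND)
    wait = years_until_speedup(n, DOUBLING_PERIOD_YEARS)
    print(f"computers needed: {n}")
    print(f"years to wait for one computer that fast: {wait:.1f}")
```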

A bunch of computers is not a Grid
- HEP has experience with a couple of thousand computers in one place
BUT
- Putting them all in one spot leads to traffic jams
- CERN can't pay for it all
- Someone else controls your resources
- Can you use them for other (non-CERN) work?

Distribute computers like users
- Most of the computer power is not at CERN
  - Need to move users' jobs to available CPU
  - Data need to be close to the CPU using them
- Need computing resource management
  - How to connect users with available power?
- Need data storage management
  - How to distribute?
  - What about copies? (Lots of people want access to the same data)
- Need authorization & authentication for access to resources!

Grids: wide-area computing
- Grids implement distributed task scheduling and execution
- Grids implement distributed data
  - Storage
  - Access
  - Replication
  - Management
- Grids facilitate authentication, authorization, and accounting across national (continental, institutional) boundaries
- Grids give you potential access to 1000s of computers, but institutes can set their own priorities for their contribution: institutes own some of the resources

What does the Grid do for you?
- You submit your work, and the Grid
  - Finds convenient places for it to be run
  - Organises efficient access to your data
    - Caching, migration, replication
  - Deals with authentication to the different sites that you will be using
  - Interfaces to local site resource allocation mechanisms, policies
  - Runs your jobs
  - Monitors progress and recovers from problems
  - Tells you when your work is complete
- If your task allows, the Grid can also decompose your work into convenient execution units based on available resources and data distribution

Grid Session: Matchmaking
[Diagram: the user submits a job description through the User Interface to the Resource Broker. The Resource Broker queries the Information System about computational resources ("Where are resources to do this job?") and asks the Data Management Service to locate copies of the requested data.]

Grid Session: Job Placement
Resource Broker decides optimal place to run job:
- Available processors
- On-site (or "close") copies of job data
- User (or his virtual organization) allowed to run there
[Diagram: the Resource Broker submits the job to the chosen site (RAL in this example) and requests movement of data to the chosen site from the Data Management Service.]
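The three placement criteria can be illustrated with a toy matchmaker. This is not the DataGrid Resource Broker's actual algorithm; the `Site` class, the `choose_site` function, and the ranking rule (most datasets already on site, then most free processors) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical summary of what an information system might report."""
    name: str
    free_cpus: int
    datasets: set = field(default_factory=set)
    allowed_vos: set = field(default_factory=set)


def choose_site(sites, vo, needed_datasets):
    """Pick a site for a job, or None if no site can accept it.

    A site qualifies only if the user's virtual organization may run there
    and it has at least one free processor. Among qualifying sites, prefer
    the one holding the most needed datasets, then the most free CPUs.
    """
    candidates = [s for s in sites if vo in s.allowed_vos and s.free_cpus > 0]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda s: (len(needed_datasets & s.datasets), s.free_cpus),
    )


if __name__ == "__main__":
    sites = [
        Site("RAL", 40, {"run-A"}, {"atlas"}),
        Site("NIKHEF", 100, set(), {"atlas"}),
        Site("CERN", 500, {"run-A"}, {"cms"}),
    ]
    best = choose_site(sites, "atlas", {"run-A"})
    print(best.name if best else "no site available")
```

In this toy ranking, data locality outweighs raw processor count, so the job goes to the smaller site that already holds the data.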

Grid Session: Job Termination
[Diagram: the site (RAL in this example) notifies the Resource Broker, which notifies the user. The user retrieves the output files through the User Interface. Optionally, a request is made to the Data Management Service to move and register large output datasets.]

What's There Now?
- Job Submission
  - Marriage of Globus and Condor-G: works relatively well
- Information System
  - Globus MDS (Metacomputing Directory Service): problems with stability; planned to be replaced with R-GMA (product from the DataGrid project)
- File Transfer
  - GridFTP: works very well and uses multiple internet connections to transfer files very quickly; can utilize up to 90% of available connection bandwidth
- Data Management
  - GDMP: very basic prototype, fragile, to be replaced shortly

More Stuff There Now
- Cluster Management
  - LCFG: extremely useful tool
  - Used to manage about 25 machines at NIKHEF
  - One server machine contains the configuration for each machine type, plus a map of which machines should be what type
  - Each machine controlled by LCFG polls the server every two minutes for new configuration information or software upgrades
  - Possible to reconfigure the cluster completely in about 15 minutes (power fail story)
  - New machine? Little work, very quick

More Stuff There Now
- Networking
  - Bandwidth monitoring services nearly finished
    - Find out how "close" a computing center is to the data needed by a job
  - Lots of interesting monitoring tools
- Security
  - GSI from Globus works quite well in practice
  - User obtains certificate from national authority
    - "I am Jeff Templon"
    - Protected by passphrase
  - Certificate subjects distributed to places where JT has access
  - You can use your cert (from anywhere!) to access Grid services

More Stuff There Now
- Virtual Organizations
  - We have ten:
    - Four LHC experiments, two US HEP experiments
    - Bioinformatics
    - Earth Observation
    - Two for development activities
- Each site can
  - Decide whether to accept individual VOs
  - Assign priorities to VOs
- Some services have copies for each VO (e.g. Data Management)

What is on the horizon
- True Replica Management
  - Distributed Replica Catalog: each grid site keeps a list of datasets present locally, with fast transparent access to lists from other sites
  - Data Management at job submission: Resource Broker commands Data Management Service to move files to support user jobs
  - Strategic: Data Management Service keeps track of who accessed what data from where and makes automatic movement to improve job performance
- Mass Storage Support
  - Make mass storage (e.g. tape robots) invisible for the user
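To make the replica-management ideas concrete, here is a small sketch of a per-site catalog, a cross-site lookup, and an access log that suggests where extra copies might help. None of this reflects the real DataGrid software; `ReplicaCatalog`, `locate`, `AccessLog`, and the threshold rule are hypothetical.

```python
from collections import Counter


class ReplicaCatalog:
    """Per-site list of locally held datasets (hypothetical structure)."""

    def __init__(self, site):
        self.site = site
        self.datasets = set()

    def register(self, dataset):
        self.datasets.add(dataset)


def locate(dataset, catalogs):
    """Ask every site catalog which sites hold a copy of the dataset."""
    return sorted(c.site for c in catalogs if dataset in c.datasets)


class AccessLog:
    """Counts (dataset, requesting site) accesses to suggest new replicas."""

    def __init__(self):
        self.counts = Counter()

    def record(self, dataset, from_site):
        self.counts[(dataset, from_site)] += 1

    def suggest_replicas(self, catalogs, threshold):
        """Return (dataset, site) pairs accessed at least `threshold` times
        from a site that does not yet hold a local copy."""
        holders = {c.site: c.datasets for c in catalogs}
        return sorted(
            (dataset, site)
            for (dataset, site), n in self.counts.items()
            if n >= threshold and dataset not in holders.get(site, set())
        )
```

A strategic service along these lines would periodically call `suggest_replicas` and ask the data management layer to copy the suggested datasets.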

What we're missing
- How to do automatic program decomposition
  - HEP has big files full of events
  - Would like the Grid to break up a job into several pieces: as many pieces as there are available processors!
- Grid needs to know something about how to decompose
  - Your file is just a bunch of bits unless you tell the Grid how to read it
- Similar problems for true parallel jobs
  - How to distribute on-the-fly based on number of nodes available?
  - Are there efficient high-latency algorithms out there?
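Once the Grid knows that a file is a sequence of independent events, the decomposition itself is simple. The sketch below splits an event range into as many contiguous pieces as there are processors; the function name and the half-open range convention are illustrative choices, and the hard part (telling the Grid how to read the file) is not addressed.

```python
def split_events(n_events, n_processors):
    """Split events 0..n_events-1 into contiguous half-open ranges.

    Returns at most n_processors (start, stop) pairs whose sizes differ by
    at most one event. Fewer pieces are returned when there are fewer
    events than processors.
    """
    if n_processors < 1:
        raise ValueError("need at least one processor")
    base, extra = divmod(n_events, n_processors)
    ranges = []
    start = 0
    for i in range(n_processors):
        size = base + (1 if i < extra else 0)
        if size == 0:
            break
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    for start, stop in split_events(10, 3):
        print(f"job piece: events {start} to {stop - 1}")
```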

What's been hard
- Collaboration: distributed software construction is hard
- Make services work together without making them codependent
- "Paratrooper Programming": current software survives only in a controlled environment

Trends to Watch
- Opportunistic Scheduling
  - Condor project: install Grid software on desktop PCs, let outside users take spare cycles. We have 171 desktop Linux systems at NIKHEF, and mine was 98.6% idle when I wrote this
- Web Services
  - Current Grid services are accessed over the internet and advertised in an information system; programs using a service must already know how to use it
  - Web services: a service registers with an information system (service registry)
  - Tells the registry "this is how a program is supposed to use my service"
  - Sent as an XML description to client programs

Example: File Transfer
- Suppose my program needs to transfer output to some other machine (server)
- Current situation: the worker node (where my program runs) needs to be preprogrammed for all expected protocols on all servers on all machines
- Web Services: the worker-node file transfer program must be able to understand XML
- Service registry provides
  - List of data transfer services provided by the target machine
  - Instructions (via XML) on how to use the protocol each service implements
- Client program contacts the selected service "per prescription"
- Grid version called OGSA, a collaboration between the Globus project and IBM (with support from NASA Information Power Grid)
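The registry lookup can be sketched with Python's standard XML parser. The XML layout below is invented for illustration and is not the OGSA or any real web-service description format; the host name, the `pick_service` function, and the attribute names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry answer describing the target machine's services.
REGISTRY_XML = """
<services host="server.example.org">
  <service protocol="gridftp" port="2811"/>
  <service protocol="http" port="80"/>
</services>
"""


def pick_service(xml_text, supported_protocols):
    """Return (protocol, port) of the first advertised service the client
    can speak, or None if there is no match."""
    root = ET.fromstring(xml_text.strip())
    for service in root.findall("service"):
        protocol = service.get("protocol")
        if protocol in supported_protocols:
            return protocol, int(service.get("port"))
    return None


if __name__ == "__main__":
    choice = pick_service(REGISTRY_XML, {"http", "gridftp"})
    print(f"transfer via {choice}")
```

The point of the design is that the client only needs to understand the description format; which protocols a server offers is discovered at run time rather than preprogrammed.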

Conclusions
- Grids are well suited to providing HEP computing power
- Grids have advantages for strategic sharing of local and remote computing resources
- We have quite a bit working already (European DataGrid project)
- Still learning how to make "paratrooper" programs
- It will be very interesting to see if the Web Service concept lives up to expectations