DOCUMENT IDENTIFIER
File name: temesi_tibor_HPC.jpg
Thumbnail: https://dka.oszk.hu/078300/078380/temesi_tibor_HPC_kiskep.jpg
Main title: Az adatok optimális helye HPC környezetben (The optimal place for data in an HPC environment)
Filing title: Adatok optimális helye HPC környezetben
Role: creator
Surname: Temesi
Given name: Tibor
Name to be inverted: N
Event: recorded, date: 2021-05-20
Event: available, date: 2021-04-08
Note on the date: The date of the presentation.
Type: presentation; lecture
Designation: Presentation; Library science - presentation; Networkshop 2021; Videotorium
Rights holder: Temesi Tibor
Copyright notes: Protected by copyright
Topic: Computing, networks / Subtopic: Hardware, digital devices
Topic: Computing, networks / Subtopic: Computing in general
Subject terms (keywords): computing; supercomputer; data management; data processing
Subject term (period): 2021
Caption: The optimal place for data in an HPC environment
Raw or OCR text:
The optimal place for data in an HPC environment
Temesi Tibor
tibor@silicon.hu
Silicon Computers
"The confluence of AI with traditional simulations is going to transform the very nature of high performance computing.
That's the thing that I think is going to be yet another sea change in how we do science."
"We're not doing our grandfather's HPC here."
Rick Stevens, Associate Lab Director, Argonne National Laboratory
www.nextplatform.com/2019/10/22/exascale-is-not-your-grandfathers-hpc
SIMULATION + AI = OVERPROPORTIONAL STORAGE GROWTH
HPC storage spending is forecast to grow at a 40% higher CAGR than HPC servers through 2024.
Projected Compound Annual Growth Rate for customer spending 2019-2024
Source: Hyperion Research, 2019 Market Results, New Forecasts and HPC Trends, April 2020
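The growth claim above is compound-growth arithmetic. A minimal sketch of that arithmetic follows. The spending index and the 5% server rate are hypothetical, not figures from the Hyperion report, and reading "40% higher" as a relative difference (storage CAGR = 1.4 x server CAGR) is an assumption.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding `rate` once per year for `years` years."""
    return start_value * (1.0 + rate) ** years


# Hypothetical illustration only: both markets start at an index of 100 in 2019.
server_rate = 0.05                 # assumed server CAGR
storage_rate = server_rate * 1.4   # assumed reading of "40% higher CAGR"
server_2024 = project(100.0, server_rate, 5)
storage_2024 = project(100.0, storage_rate, 5)
print(f"servers: {server_2024:.1f}, storage: {storage_2024:.1f}")
```

Because the rates compound, even a small difference in CAGR widens the gap between the two curves every year.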
INPUT/OUTPUT PROFILES COULD NOT BE MORE DIFFERENT
Era of convergence of traditional simulation and AI requires NEW HPC storage
Traditional simulation
Mainly WRITING PETAbytes of LARGE FILES in SEQUENTIAL order @ extreme speed
INPUT DATA + ALGORITHM
CPU nodes
OUTPUT
Machine Learning
Mainly READING TERAbytes of FILES OF ALL SIZES in RANDOM order @ extreme speed
INPUT DATA + HISTORICAL OUTPUT
GPU nodes
ALGORITHM
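The two I/O profiles above can be imitated on a small scale. The sketch below writes one file sequentially, as a simulation checkpoint would, and then reads blocks back at random offsets, as a training data loader would. The block size, file name, and function names are illustrative assumptions; real workloads run against a parallel file system at far larger scale.

```python
import os
import random
import tempfile

BLOCK = 4096  # bytes per I/O request (illustrative)


def sequential_write(path: str, n_blocks: int) -> int:
    """Simulation-style output: write blocks one after another."""
    written = 0
    with open(path, "wb") as f:
        for i in range(n_blocks):
            # Each block is filled with its own index so reads can be checked.
            written += f.write(bytes([i % 256]) * BLOCK)
    return written


def random_read(path: str, n_reads: int, seed: int = 0) -> list[int]:
    """Training-style input: read whole blocks at random offsets."""
    rng = random.Random(seed)
    n_blocks = os.path.getsize(path) // BLOCK
    first_bytes = []
    with open(path, "rb") as f:
        for _ in range(n_reads):
            block_index = rng.randrange(n_blocks)
            f.seek(block_index * BLOCK)
            first_bytes.append(f.read(BLOCK)[0])
    return first_bytes


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "checkpoint.bin")
    total = sequential_write(path, n_blocks=64)
    samples = random_read(path, n_reads=8)
    print(total, samples)
```

On spinning disks every `seek` in the random-read loop costs a mechanical head movement, which is why the read-heavy profile favours flash.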
START ANYWHERE, GO TO WHEREVER YOU
Nearly 10 TB/sec and > 700 PB capacity*
simulations for quantum computers, nuclear energy systems, fusion reactors, and precision medicines. 2021-2022
> 4 TB/sec and 30PB (All Flash) capacity*
NESAP enhance simulation, data processing, and machine learning applications
2.5 TB/sec and ~ 400 PB capacity*
nuclear stockpile, secondary national security missions, nuclear nonproliferation and counterterrorism
2023
2.5 TB/sec and ~ 100 PB capacity*
LUMI - Large Unified Modern Infrastructure
2021
1 TB/sec and ~ 200 PB capacity*
cancer research, materials science, climate science, and cosmology
2H 2021
1 TB/sec and ~ 1 PB (All Flash) capacity*
IT4I - Astrophysics, Engineering, Chemistry, Materials, Earth, and Life sciences
1H 2021
240 GB/sec and ~ 14.5 PB capacity*
2H 2020
All Flash entry point (6U): up to 80/50 GB/sec read/write and 115 TB capacity*
Disk entry point (10U): 15 GB/sec and 315 TB capacity*
All Flash base rack: > 1TB/sec and up to 4.5 PB capacity*
Expansion rack: > 2 TB/sec and 4.6 PB*
Disk base rack: 90 GB/sec and 7.5 PB capacity*
Expansion rack: 120 GB/sec and 10 PB*
EXASCALE REQUIREMENTS
More powerful "engines"
Better communication: protocol, network, ...
Application-oriented architectures
Data processing
The goal is to keep the data as close as possible to the processor while it is being worked on.
Data read/write speed: it is not optimal if the engines end up "waiting quickly" for data.
Obviously, not all data can reside in the internal cache of the processor chips, so where should it be?
"Who" should move the data?
MEMORY
Processor internal cache
External memory:
DRAM (32-bit bus),
HBM (4096-bit bus): lots of data (AI) -> wider data paths (HBM memory), lower latency
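Why the wider bus matters can be shown with one line of arithmetic: peak bandwidth is bus width in bytes times the transfer rate. The sketch below uses the two bus widths quoted above. The transfer rate of 2 GT/s is a hypothetical value applied to both interfaces purely for comparison; real DRAM and HBM parts run at different rates.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfers_per_sec: float) -> float:
    """Theoretical peak bandwidth in GB/s for a given bus width and transfer rate."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_sec / 1e9


ASSUMED_RATE = 2e9  # transfers per second; hypothetical, same for both buses

narrow = peak_bandwidth_gb_s(32, ASSUMED_RATE)    # 32-bit bus from the slide
wide = peak_bandwidth_gb_s(4096, ASSUMED_RATE)    # 4096-bit bus from the slide
print(f"32-bit: {narrow} GB/s, 4096-bit: {wide} GB/s, ratio: {wide / narrow}")
```

At equal transfer rates the bandwidth ratio is simply the ratio of the bus widths, 4096 / 32 = 128.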
STORAGE
Common, shared, multi-tier storage system
Maximum read/write performance for the data currently being processed: SCRATCH: all-flash, NVMe, ...
Storage capacity for operational data: HOME: HDD
Archive tier: tape library
Automated data management
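Automated data management across the scratch, home, and archive tiers listed above amounts to a placement rule. A minimal sketch follows, assuming placement is decided only by how recently a file was accessed. The `FileInfo` type, the function name, and the 7-day and 180-day thresholds are hypothetical; a production policy engine would weigh more attributes.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    days_since_access: int


def choose_tier(f: FileInfo, scratch_days: int = 7, home_days: int = 180) -> str:
    """Map a file to a tier by access age (thresholds are illustrative)."""
    if f.days_since_access <= scratch_days:
        return "scratch"   # all-flash / NVMe: data being processed now
    if f.days_since_access <= home_days:
        return "home"      # HDD: operational data
    return "archive"       # tape library: dormant data


files = [
    FileInfo("/project/run42/step_0100.h5", 1),
    FileInfo("/project/run17/summary.csv", 60),
    FileInfo("/project/2015/raw.tar", 2000),
]
for f in files:
    print(f.path, "->", choose_tier(f))
```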
NON-VOLATILE MEMORY EXPRESS (NVMe)
A new storage protocol focused on SSDs that replaces traditional I/O stacks built on the SCSI protocol, which is optimized for spinning media.
It leverages PCIe instead of SAS/SATA for greater bandwidth, IOPS and reduced latency
Designed to move beyond HDDs
Standard interface for Solid State Media
A new protocol
Media?
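The bandwidth advantage of PCIe over SAS/SATA can be estimated from line rate and encoding overhead. The slide names only the interface families; the specific generations below (SATA III, SAS-3, PCIe 3.0 x4) are an assumption chosen for illustration, with their commonly published line rates and encodings.

```python
def usable_gb_s(line_rate_gbit: float, payload_bits: int,
                coded_bits: int, lanes: int = 1) -> float:
    """Usable bandwidth in GB/s after line-encoding overhead."""
    return line_rate_gbit * payload_bits / coded_bits / 8 * lanes


# Assumed generations (not named in the slide):
sata3 = usable_gb_s(6.0, 8, 10)                  # 6 Gbit/s, 8b/10b encoding
sas3 = usable_gb_s(12.0, 8, 10)                  # 12 Gbit/s, 8b/10b encoding
pcie3_x4 = usable_gb_s(8.0, 128, 130, lanes=4)   # 8 GT/s per lane, 128b/130b

for name, value in [("SATA III", sata3), ("SAS-3", sas3), ("PCIe 3.0 x4", pcie3_x4)]:
    print(f"{name}: {value:.2f} GB/s")
```

These are interface ceilings only; the lower latency and higher IOPS mentioned above come from the protocol and its queueing model, which this arithmetic does not capture.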
THREE BASIC FILE SYSTEM CONFIGURATION OPTIONS
All HDD file system
/lustre
Specifications per rack:
Up to 120 GB/sec (read/write)
Up to 10 PB usable capacity (16 TB HDD)
Scalable Storage Units (SSU)
SSU-D2 (10U)
Up to 30 GB/sec from 212 HDD
SSU-D4 (18U)
Up to 40 GB/sec from 424 HDD
Hybrid file system
/lustre
/lustre/disk
/lustre/flash
Disk and Flash pools in the same namespace with Cray ClusterStor data services providing capacity from Disk and performance from Flash
All Flash file system
/lustre
Specifications per rack:
Up to 1,600 GB/sec (read)
Up to 1,000 GB/sec (write)
Scalable Storage Unit (SSU)
SSU-F (2U)
Up to 80 GB/sec sequential read from 24 SSD
Up to 50 GB/sec sequential write from 24 SSD
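The per-rack and per-SSU figures above can be cross-checked with a simple sizing calculation. The sketch assumes throughput scales linearly with the number of SSUs, which is an idealization; the function name is hypothetical, and the bandwidth numbers are taken from the slide.

```python
import math


def ssus_needed(target_gb_s: float, per_ssu_gb_s: float) -> int:
    """Number of storage units needed to reach a target bandwidth (linear scaling assumed)."""
    return math.ceil(target_gb_s / per_ssu_gb_s)


# SSU-F figures from the slide: 80 GB/sec read, 50 GB/sec write, 2U each.
for_read = ssus_needed(1600, 80)    # rack-level read target
for_write = ssus_needed(1000, 50)   # rack-level write target
rack_units = max(for_read, for_write) * 2

print(for_read, for_write, rack_units)
```

Both targets work out to the same number of SSU-F units, so the rack-level read and write figures are consistent with the per-unit ones.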
BASIC LUSTRE
Typical Lustre components
CLUSTERSTOR E1000 LUSTRE FILE SYSTEM & DMF V7
TAPE LIBRARY
Data preservation, archiving, and backup
TFINITY EXASCALE - ENTERPRISE ARCHIVE
TFinity Delivers
Over 1 Exabyte native capacity with LTO-9
Up to 858 PB native capacity with TS1160
24 x 7 x 365 operations
Dual robotics for availability and performance
Flexible Configurations
3 to 45 frames
100 to over 56,400 LTO slots/42,930 TS slots
2 to 144 tape drives
LTO-9, LTO-8, LTO-7, TS1160, TS1155, TS1150, T10K
Technology
Upgrade in 10 LTO or 9 TS slot increments
400 mounts per hour
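The headline capacities follow from slot count times cartridge capacity. The sketch below reproduces that arithmetic. The slot counts come from the slide; the native cartridge capacities (18 TB for LTO-9, 20 TB for TS1160) are not on the slide and are used here as assumptions based on commonly published figures.

```python
def native_capacity_pb(slots: int, tb_per_cartridge: float) -> float:
    """Native (uncompressed) library capacity in PB, using decimal units."""
    return slots * tb_per_cartridge / 1000.0


LTO9_TB = 18.0     # assumed native capacity per LTO-9 cartridge
TS1160_TB = 20.0   # assumed native capacity per TS1160 cartridge

lto_pb = native_capacity_pb(56_400, LTO9_TB)
ts_pb = native_capacity_pb(42_930, TS1160_TB)
print(f"LTO-9: {lto_pb:.1f} PB, TS1160: {ts_pb:.1f} PB")
```

Under these assumptions a fully populated LTO-9 library lands just above 1,000 PB, matching "over 1 Exabyte", and the TS1160 configuration lands at about 858 PB.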
TERAPACK ARCHITECTURE
Industry leading density
Industry’s smallest footprint
Reduce floor space requirements
Reduce tape handling
10 LTO or 9 TS11xx tapes per TeraPack
TeraPack design allows TFinity to minimize the floor space needed for the tape archive.
TeraPacks make moving media much more efficient
Both in the TFinity and when transporting or moving tapes outside the library
STANDARD BLUESCALE SOFTWARE FEATURES
Media Lifecycle Management (MLM) - Media Health reporting
Drive Lifecycle Management (DLM) - Drive Health reporting and diagnostics
Hardware Health Management (HHM) - Library Hardware reporting and diagnostics
Standard Encryption - Built in Encryption Key Management
Data Integrity Verification (DIV) - Verify data on tape automatically as a background task without host system involvement
AutoSupport - The library will automatically create a support ticket if any critical event is encountered
Partitioning - Support for up to 16 partitions
Remote Library Connection (RLC) - Manage the library from anywhere through the same interface that is presented on the front panel touchscreen
Auto Drive Clean - Let the library automatically clean tape drives when they request to be cleaned
Compatible with all major ISV software
Spectra Logic HPC Customer References Europe
HPC/AI COMPUTE & STORAGE
HPC Storage Supports an Optimized Compute Experience for Active Workloads
INTRODUCING DATA MANAGEMENT FRAMEWORK VERSION 7
DMF Enables Data Curation, Placement and Protection Over Time & Distance
DMF INTEGRATION WITH TAPE STORAGE
Still the world's most cost effective Storage @ 1¢ per GB
DMF is certified with libraries from HPE, as well as Spectra Logic, IBM and Oracle (StorageTek)
Streams to tape drive at native rates, even for small files
Block ID positioning for fast seek
Support for latest LTO-9 and Enterprise-class drive technology
Advanced feature support for accelerated retrieval and automated library management
Supports Data Integrity Verification (DIV) and Logical Block Protection (LBP) available with Oracle T10k as well as IBM LTO and TS drives
Recommended Access Order (RAO) and Spectra Logic's TAOS
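The idea behind access-order features such as RAO is to serve queued read requests in an order that reduces tape positioning time. The sketch below is a deliberately simplified stand-in, not the drive's actual algorithm: it models the tape as a single linear track and ignores the serpentine wrap layout that real drives account for. All names are hypothetical.

```python
def total_seek(start: int, order: list[int]) -> int:
    """Total head travel, in blocks, when visiting positions in the given order."""
    travelled, position = 0, start
    for target in order:
        travelled += abs(target - position)
        position = target
    return travelled


def simple_access_order(start: int, requests: list[int]) -> list[int]:
    """Serve requests ahead of the head first (ascending), then the rest on the way back."""
    ahead = sorted(p for p in requests if p >= start)
    behind = sorted((p for p in requests if p < start), reverse=True)
    return ahead + behind


requests = [900, 10, 500, 20]   # block positions in arrival (FIFO) order
fifo_cost = total_seek(0, requests)
ordered = simple_access_order(0, requests)
ordered_cost = total_seek(0, ordered)
print(fifo_cost, ordered, ordered_cost)
```

Even in this toy model, reordering cuts head travel to roughly a third of the arrival-order cost, which is the effect the feature exploits for accelerated retrieval.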
DATA MANAGEMENT FRAMEWORK VERSION 7
Resolve Data Challenges with Data Management
Optimize Storage Utilization
Hierarchical storage management:
Integrated policy engine allows tiered data model with automated data movement to appropriate data tier, e.g. SSD/Flash tier for active data, tape and cloud for dormant/archive data
Maximize resources with automatic archival of stale or cold data
Data Protection Strategy
Safeguard petabytes of data with integrated, continuous backups
Maintain data integrity via self-describing media format and data checksums
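Checksum-based integrity protection can be sketched in a few lines: a digest is recorded when data is written and recomputed when it is read back. SHA-256 is used here as an assumption; the slide does not say which checksum DMF uses, and the function names are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Digest recorded alongside the data at write time (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected: str) -> bool:
    """Recompute the digest at read time and compare it with the stored one."""
    return checksum(data) == expected


original = b"simulation output block"
stored_digest = checksum(original)

corrupted = b"simulation output blocK"   # one flipped character
print(verify(original, stored_digest), verify(corrupted, stored_digest))
```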
Streamline Workflows
Accelerate workflows:
Simplify management of large file collections with data set labeling
Construct data sets using file metadata including extended attributes
Automate data workflows with integrated job scheduling
Streamline data recovery with automatic file versioning and Point-in-Time restore
Reduce manual intervention:
Reduce script writing and script maintenance
Reduce operator errors with automatic data movement
Manage Costs
Control Storage Costs:
Minimize TCO by moving data to the most cost-effective tier for the desired access level, i.e. from flash to HDD to tape to cloud
Scale storage acquisitions incrementally depending on need, i.e. more flash for active data, more tape for archive depending on demand
Protect and future-proof data with automated migration to new storage infrastructure
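The cost argument for tiering can be illustrated with a rough media-cost comparison. Only the tape price (1 cent per GB, read here as US cents) comes from the slides; the flash and HDD prices and the capacity split are hypothetical placeholders, and the sketch ignores power, drives, and licensing.

```python
# Price per GB in USD. Tape is from the slide; flash and HDD are hypothetical.
PRICE_PER_GB = {"flash": 0.20, "hdd": 0.03, "tape": 0.01}


def media_cost_usd(capacity_tb_by_tier: dict[str, float]) -> float:
    """Media cost for a capacity split across tiers (decimal TB to GB)."""
    return sum(tb * 1000 * PRICE_PER_GB[tier]
               for tier, tb in capacity_tb_by_tier.items())


all_flash = media_cost_usd({"flash": 1000})
tiered = media_cost_usd({"flash": 100, "hdd": 300, "tape": 600})
print(f"all flash: ${all_flash:,.0f}, tiered: ${tiered:,.0f}")
```

With these placeholder prices, keeping only the active 10% of a 1 PB data set on flash reduces media cost several times over, which is the effect behind moving data to the most cost-effective tier.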
Thank you
Document language: Hungarian
Document language: English
Related document: Szommer Katalin: Parametrikus modellezés oktatása online térben
Format: PowerPoint presentation (pages: 25; technical note: Microsoft Office PowerPoint 2016; metadata in document: N)
Format: PDF document (pages: 25; metadata in document: N)
Format: HTML document (technical note: HTML 5 version; metadata in document: N)
Best format: JPEG image file
Colour: colour
Compression quality: medium compression
Record status: DONE
Role/capacity: cataloguing
Processed by: Nagy Zsuzsanna