GPU Day 2021
from Wednesday, 10 November 2021 (08:50) to Thursday, 11 November 2021 (14:00)
Wednesday, 10 November 2021
09:00
Opening Talk and Welcome by the Director
09:00 - 09:20
09:20
Space-ready FPGA hardware acceleration for .NET software - Hastlayer
Erno David (MTA Wigner FK), Zoltán Lehóczky (Lombiq Technologies Ltd.)
09:20 - 09:40
Hastlayer (https://hastlayer.com/) by Lombiq Technologies is a .NET software developer-focused, easy-to-use high-level synthesis tool with the aim of accelerating applications. It converts standard .NET Common Intermediate Language (CIL) bytecode into equivalent Very High Speed Integrated Circuit Hardware Description Language (VHDL) constructs which can be implemented in hardware using FPGAs. After cloud-available FPGA platforms, we've made Hastlayer compatible with the Zynq 7000 family of FPGA SoC devices. The primary goal is to be able to utilize onboard computers of satellites built with the same hardware, readily available by NewSpace manufacturers. In this talk, we'll introduce Hastlayer and how it can be used, the challenges and experiences of making it compatible with Zynqs, and our results showing up to 2 orders of magnitude speed and power efficiency increases.
09:40
200+ GPUs in one HPC - available in months
Zoltan Kiss (KIFÜ)
09:40 - 10:00
A new 5 PF HPC system is being built in Hungary with more than 200 A100 GPUs. It will offer dedicated partitions: a CPU-only partition with almost 20 000 CPU cores, a GPU partition with 200+ Nvidia A100 GPUs, a Big Data partition with 9 TB of RAM, and an AI partition with 8 GPU nodes. It will be complemented by advanced HPC software and a portal system open to both SMEs and academia. This talk goes into the details of the new machine and the offerings of the HPC Competence Centre, including future plans. We are ready to support Hungarian and international research, including quantum simulators; apply for resources today!
10:00
Standards in HPC
Máté Ferenc Nagy-Egri (MTA Wigner FK)
10:00 - 10:40
10:40
Coffee Break
10:40 - 11:00
11:00
The GUARDYAN code for high fidelity nuclear reactor calculations
David Legrady (Dr.)
11:00 - 11:30
GUARDYAN (GPU Assisted Reactor Dynamic Analysis) is a continuous energy Monte Carlo (MC) neutron transport code developed at Budapest University of Technology and Economics. It aims to solve time-dependent problems related to fission reactors, with the main focus on simulating and analyzing short transients. The key idea of GUARDYAN is a massively parallel execution structure that makes use of the advanced programming possibilities available on CUDA-enabled GPUs. Compared to similar code systems, GUARDYAN is the first to scale up to nuclear power plant levels, targeting the simulation and analysis of severe accident scenarios for reactor safety analysis. Recent advances include the coupling with thermal-hydraulics solvers and comparison to actual measurements at the Paks Nuclear Power Plant.
11:30
Solving the Kuramoto Oscillator Model of Power Grids on GPU
Lilla Barancsuk (Budapest University of Technology and Economics)
11:30 - 11:50
Power grids are large complex networks whose dynamics, stability and vulnerability are intensively studied; new challenges arise with the increase of distributed renewable energy resources. The dynamics of electrical grids is strongly affected by desynchronization between nodes, which can start an avalanche-like cascade of line failures causing massive outages. Modelling power systems in detail leads to an increased computational cost, as a much larger number of nodes (on the order of millions) needs to be handled than in traditional power grid models. The Kuramoto model is a set of coupled nonlinear ordinary differential equations that describes the power grid as an ensemble of coupled oscillators, and it is widely used for investigating the synchronization properties of networks. Modelling the power grid with the Kuramoto model amounts to solving a system of such equations in which each equation corresponds to a node of the grid, so millions of equations have to be solved. To handle the model efficiently, we numerically solved the second-order Kuramoto equations on a GPU and simulated cascades as threshold line failures. In this talk, we present our solution, in which a special memory layout for the network graph has been introduced for an effective implementation. We studied different numerical solvers supplied by *boost*’s *odeint* library, which we compared in terms of precision and performance.
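As a rough illustration of the kind of system described in this abstract, the following is a minimal CPU sketch of the second-order Kuramoto equations integrated with a fixed-step 4th order Runge-Kutta scheme. Everything specific here is an assumption made for the example: the CSR-style neighbour arrays (`row_ptr`, `col_idx`), the uniform `coupling` and `damping` parameters, and the hand-written RK4 step. The abstract does not say which memory layout the authors use, and their solvers come from *boost*’s *odeint*, not from this code.

```python
import math

def kuramoto_rhs(theta, omega, power, row_ptr, col_idx, coupling, damping):
    """Right-hand side of the second-order Kuramoto model.

    d(theta_i)/dt = omega_i
    d(omega_i)/dt = power_i - damping * omega_i
                    + coupling * sum_j sin(theta_j - theta_i)
    Neighbours of node i are col_idx[row_ptr[i]:row_ptr[i + 1]]
    (a CSR-style layout, assumed here for illustration only).
    """
    n = len(theta)
    dtheta = list(omega)
    domega = [0.0] * n
    for i in range(n):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += math.sin(theta[col_idx[k]] - theta[i])
        domega[i] = power[i] - damping * omega[i] + coupling * acc
    return dtheta, domega

def rk4_step(theta, omega, dt, *params):
    """One classical RK4 step for the coupled (theta, omega) system."""
    def shifted(base, slope, h):
        return [b + h * s for b, s in zip(base, slope)]

    k1t, k1o = kuramoto_rhs(theta, omega, *params)
    k2t, k2o = kuramoto_rhs(shifted(theta, k1t, dt / 2),
                            shifted(omega, k1o, dt / 2), *params)
    k3t, k3o = kuramoto_rhs(shifted(theta, k2t, dt / 2),
                            shifted(omega, k2o, dt / 2), *params)
    k4t, k4o = kuramoto_rhs(shifted(theta, k3t, dt),
                            shifted(omega, k3o, dt), *params)
    new_theta = [t + dt / 6 * (a + 2 * b + 2 * c + d)
                 for t, a, b, c, d in zip(theta, k1t, k2t, k3t, k4t)]
    new_omega = [w + dt / 6 * (a + 2 * b + 2 * c + d)
                 for w, a, b, c, d in zip(omega, k1o, k2o, k3o, k4o)]
    return new_theta, new_omega

def simulate(theta, omega, dt, n_steps, *params):
    """Integrate for n_steps fixed steps and return the final state."""
    for _ in range(n_steps):
        theta, omega = rk4_step(theta, omega, dt, *params)
    return theta, omega

# Two-node toy grid: one generator (+0.5) and one consumer (-0.5),
# connected by a single line stored in both directions.
row_ptr, col_idx = [0, 1, 2], [1, 0]
theta, omega = simulate([0.0, 0.0], [0.0, 0.0], 0.01, 4000,
                        [0.5, -0.5], row_ptr, col_idx, 1.0, 1.0)
print(theta[0] - theta[1])  # the synchronized phase difference satisfies sin(d) = 0.5
```

On a GPU, the loop over nodes in `kuramoto_rhs` is the natural unit of parallel work. A threshold line failure could be modelled by dropping an edge from `col_idx` once the flow on it exceeds a limit, but that step is not shown here.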
11:50
Particle Simulation of Resonant Nanoantennas for Laser Driven Fusion
Istvan Papp (Wigner FK)
11:50 - 12:10
Recently, Nanoplasmonic Laser Induced Fusion Experiments were proposed as an improvement in achieving laser driven fusion [1]. This combines recent discoveries in heavy-ion collisions and optics. The existence of detonations with time-like normal on space-time hyper-surfaces, combined with absorption adjustment using nanoantennas, allows the possibility of heating the target in an opposing laser beam setup [2]. For tracking the time evolution of non-equilibrium plasma interacting with strong laser fields, kinetic modeling is the most appropriate approach. However, describing the absorption effects of gold nanoantennas inside a medium requires different approaches. Here we will present a particle-in-cell model of resonant nanoantennas using the capabilities of the EPOCH multi-component PIC code [3]. [1] L.P. Csernai, N. Kroó, & I. Papp, Radiation-Dominated Implosion with Nano-Plasmonics, Laser and Particle Beams 36, 171-178 (2018). [2] L.P. Csernai, M. Csete, I.N. Mishustin, A. Motornenko, I. Papp, L.M. Satarov, H. Stöcker & N. Kroó, Radiation-Dominated Implosion with Flat Target, Physics and Wave Phenomena, 28 (3) 187-199 (2020) in press, accepted February 3, 2020, (arXiv:1903.10896v3). [3] T. D. Arber, et al., Contemporary particle-in-cell approach to laser-plasma modelling, Plasma Phys. Control. Fusion 57, 113001 (2015).
12:10
Accelerating Tridiagonal Solvers
István Reguly (PPKE ITK)
12:10 - 12:30
In this talk, we present work recently done by our group on the parallel solution of multiple tridiagonal linear systems that typically arise during the solution of discretised partial differential equations. We briefly introduce the established serial (Thomas) and parallel (Parallel Cyclic Reduction) algorithms for individual systems, then discuss how multiple systems are formed and solved in a high-dimensional system, including shared memory, distributed memory, and pipeline parallelism, targeting recent many-core CPUs, GPUs and FPGAs. We demonstrate scalability up to 16k CPU cores or 32 GPUs for large systems representative of CFD applications. We also study computational and energy efficiency on GPUs and FPGAs on smaller problems representative of applications in computational finance, demonstrating that a Xilinx Alveo U280 can closely match an NVIDIA V100 GPU in terms of throughput, and significantly outperform it in terms of energy efficiency.
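For readers unfamiliar with the serial baseline named in this abstract, here is a minimal sketch of the Thomas algorithm for a single tridiagonal system, plus a trivial batch wrapper. The function names and the array convention (`lower[0]` and `upper[-1]` are unused) are choices made for this example, and the code assumes the systems need no pivoting (for instance, diagonally dominant matrices). It does not reproduce the group's GPU or FPGA implementations.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve one tridiagonal system with the Thomas algorithm.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. No pivoting is performed.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward elimination: remove the sub-diagonal.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x

def solve_batch(systems):
    """Solve many independent systems; each one could map to its own thread."""
    return [thomas_solve(*system) for system in systems]

# 1D Laplacian-like system whose exact solution is [1, 2, 3, 4].
lower = [0.0, -1.0, -1.0, -1.0]
diag = [2.0, 2.0, 2.0, 2.0]
upper = [-1.0, -1.0, -1.0, 0.0]
rhs = [0.0, 0.0, 0.0, 5.0]
print(thomas_solve(lower, diag, upper, rhs))
```

The forward sweep carries a dependency from row i-1 to row i, which is why a single system is inherently serial under this algorithm and why batching many independent systems is the usual route to parallelism.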
12:30
Lunch break
12:30 - 14:00
14:00
Implementing Hierarchical Bayesian Networks on the GPU
László Dobos (Wigner FK)
14:00 - 14:30
Designing spectroscopic follow-up observations in astronomy poses several challenges. Observing spectra is significantly more time consuming than photometric imaging, yet interesting objects need to be selected based on images taken with only a few broad-band filters. Hierarchical Bayesian Networks are often used to estimate physical parameters of photometrically observed stars, a prerequisite to successful spectroscopic targeting. We present an implementation of a novel Bayesian model which can be used to fit parameters of mixtures of stellar populations to derive physical parameters as well as population membership probabilities for each star. Since the Adaptive Monte Carlo method used to integrate the model is computationally expensive, we rely heavily on GPUs.
14:30
AI application in stellar spectroscopy
Viska Wei
14:30 - 15:00
Artificial Neural Networks have been applied in many fields of science and are particularly successful in image processing. Here we outline the challenges of Deep Learning in stellar spectroscopy, since stellar spectra are fundamentally different from images. Although only one-dimensional, spectra show no translation invariance and important features appear on all scales: While the surface temperature of a star can be told either from the overall shape of the spectrum or the strengths of certain easily detectable absorption lines, other physical parameters, such as chemical element abundances, are encoded in many small features scattered at a multitude of wavelengths. We also consider applications other than physical parameter inference: denoising and normalization with autoencoders and surrogate modelling with generative networks.
15:00
Accelerating the solution of large number of delay differential equations with GPUs
Dániel Nagy (Budapest University of Technology and Economics, Department of Hydrodynamic Systems)
15:00 - 15:20
Delay differential equations (DDEs) appear in several branches of science and engineering. Possible applications include the modelling and forecasting of epidemics, the modelling of stability loss in control systems and many more. The delays in the differential equations can be caused by the incubation time of a virus in epidemic models or, in the case of computer control, by the time the computer needs to carry out the necessary calculation. The numerical solution of problems described by DDEs is unavoidable in several cases, and sometimes a large number of the same delay differential equations must be solved due to the many possible parameter combinations or initial conditions. **The serial solution of millions of equations is usually not viable**, thus some kind of parallelization is necessary; however, **general purpose DDE solvers for GPUs do not exist at present**. Unlike ordinary differential equation (ODE) solvers, DDE solvers use values from the past because of the delay in the equation; thus, every timestep must be saved. However, it cannot be guaranteed that the required previous time instance is available in the global memory. Therefore, interpolation between the past values is necessary to maintain the order of the numerical method used. GPUs have been found to be extremely efficient for the numerical solution of a large number of ODEs. **The most efficient method for parallelizing the numerical solution is assigning each equation to one thread** (also called the *per-thread* approach). In the present work, the same strategy is applied to the acceleration of DDE solvers. The traditional 4th order fixed-timestep explicit Runge-Kutta (ERK) method is implemented and extended with 3rd order Hermite interpolation. The code is written in CUDA C++. This solver can be applied to a wide range of possible problems on most GPUs.
However, an efficient implementation is difficult since it requires the intensive use of global memory, while the goal is to reach the maximum possible FLOP efficiency. The bottleneck of such problems is the memory bandwidth; thus, finding an optimal memory structure is necessary. Furthermore, the global memory size can also be a limiting factor, as each thread saves thousands of states in the global memory. This can lead to several gigabytes of global memory usage, limiting the total number of resident threads and throttling the performance. In the study, the basic idea of the solver is shown. **An efficient memory structure is proposed and tested, with an aligned and coalesced memory access pattern**. To minimise the global memory usage, each thread works on a fixed-size array; when this array fills up, the oldest saved timesteps are overwritten in a circular fashion. Problem specific codes with the previously described structure are tested on several simple test cases, and the FLOP efficiency, memory load/store efficiency and further important metrics are measured. **It is found that 40% FLOP efficiency can be reached with 95% memory efficiency.** The FLOP efficiency of the problem specific codes can be regarded as the maximal possible efficiency of my proposed solver. The implementation of a general purpose DDE solver in the existing GPU ODE solver MPGOS is presented, and the metrics of the implementation are measured. A general solver must work for every possible problem, which requires extra computations and memory operations; consequently, its efficiency in my case only reaches half of that of the problem specific solvers. Finally, the limitations of a fixed-timestep ERK method are presented, and the possibilities of adaptive ERK solvers for GPUs are discussed.
However, for adaptive ERK methods, aligned and coalesced memory access patterns are not possible with the *per-thread* approach, thus a heterogeneous CPU-GPU solver is proposed which may be a viable solution for later implementations.
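The combination described in this abstract (fixed-step 4th order Runge-Kutta, cubic Hermite interpolation of past values, and a fixed-size history that is overwritten circularly) can be sketched for a single scalar equation as follows. This is a CPU illustration under stated assumptions, not the CUDA C++ or MPGOS code: the `History` class, the `solve_dde` signature, the restriction to one constant delay of at least two steps, and the requirement that `capacity` be at least `delay / dt + 3` are all choices made for this example.

```python
class History:
    """Fixed-size circular buffer of (y, dy/dt) samples on a uniform time grid."""

    def __init__(self, capacity, dt, initial_fn):
        self.capacity = capacity
        self.dt = dt
        self.initial_fn = initial_fn      # prescribed solution for t <= 0
        self.y = [0.0] * capacity
        self.dy = [0.0] * capacity
        self.newest = -1                  # index of the latest stored step

    def push(self, y, dy):
        """Store the next step, overwriting the oldest slot when full."""
        self.newest += 1
        slot = self.newest % self.capacity
        self.y[slot] = y
        self.dy[slot] = dy

    def value(self, t):
        """Past value at time t via cubic Hermite interpolation."""
        if t <= 0.0:
            return self.initial_fn(t)
        k = min(int(t / self.dt), self.newest - 1)   # left end of the interval
        assert self.newest - k < self.capacity, "sample already overwritten"
        s = t / self.dt - k                          # position inside the interval
        a, b = k % self.capacity, (k + 1) % self.capacity
        h00 = (1 + 2 * s) * (1 - s) ** 2
        h10 = s * (1 - s) ** 2
        h01 = s * s * (3 - 2 * s)
        h11 = s * s * (s - 1)
        return (h00 * self.y[a] + h10 * self.dt * self.dy[a]
                + h01 * self.y[b] + h11 * self.dt * self.dy[b])

def solve_dde(rhs, initial_fn, delay, dt, n_steps, capacity):
    """Integrate y'(t) = rhs(t, y(t), y(t - delay)) with fixed-step RK4."""
    hist = History(capacity, dt, initial_fn)
    y = initial_fn(0.0)
    hist.push(y, rhs(0.0, y, hist.value(-delay)))
    for n in range(n_steps):
        t = n * dt
        k1 = rhs(t, y, hist.value(t - delay))
        k2 = rhs(t + dt / 2, y + dt / 2 * k1, hist.value(t + dt / 2 - delay))
        k3 = rhs(t + dt / 2, y + dt / 2 * k2, hist.value(t + dt / 2 - delay))
        k4 = rhs(t + dt, y + dt * k3, hist.value(t + dt - delay))
        y += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        hist.push(y, rhs(t + dt, y, hist.value(t + dt - delay)))
    return y

# y'(t) = -y(t - 1) with y(t) = 1 for t <= 0; analytically y(2) = -0.5.
y_end = solve_dde(lambda t, y, y_delayed: -y_delayed, lambda t: 1.0,
                  delay=1.0, dt=0.1, n_steps=20, capacity=13)
print(y_end)
```

In a per-thread GPU version, each thread would own one such history array. How those arrays are interleaved in global memory determines whether accesses are coalesced, which this single-trajectory sketch does not address.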
15:20
Mixed precision: when is it worth it?
Bálint Siklósi (Pázmány Péter Catholic University - Hungary)
15:20 - 15:40
Mixing different floating-point precisions and number representations may be a highly effective tool for tackling some of the main challenges of exascale computing. By lowering precision, we can reduce memory and network traffic, decrease the memory footprint, achieve more floating-point operations per second by spending less time on the same operations, and reduce energy consumption. Using recently introduced hardware features, the benefit can become even larger. NVIDIA Tensor Cores provide a 2.5X speed-up in HPC by enabling mixed-precision computing, but they can also provide a 10X speed-up in AI training with their 32-bit and 16-bit Tensor Float support. Using FPGAs with half precision, the advantage increases further, since the operating area may decrease as well and the frequency of the device may increase. On the flip side, changing the representation also degrades accuracy, so mixed representation can only be used with careful consideration, which makes it even more difficult to apply automatically. In 2017, a group of NVIDIA researchers published a study detailing how to reduce the memory requirements of neural network training with a technique called Mixed Precision Training. Weights, activations, and gradients are stored in IEEE FP16 format, but in order to match the accuracy of FP32 networks, FP32 master copies of the weights are maintained. During one training step, the forward and backward passes are calculated using FP16 arithmetic, while the optimizer step and weight update are calculated using FP32 arithmetic. To avoid underflow of the gradient, they also introduce a loss scaling scheme, whereby the loss, and therefore the gradient, is scaled up by a constant factor. GPUMixer (best paper at ISC 2019) is a performance-driven automatic tuner for GPU kernels. It uses static analysis to find a set of operations (FISet) to execute in lower precision, while data entering and leaving those sets are in high precision.
They try to maximize the ratio of low-precision arithmetic to type casting operations to achieve better performance. They also apply "shadow" execution to determine the error and maintain a prescribed error bound. In our work, we want to achieve a similar automatic mixed-precision execution on unstructured mesh computations, using the OP2 domain specific language. The advantage of this system is that we can exploit further domain knowledge instead of focusing on an individual kernel. If we find a variable which acts like an accumulator, then we should keep it in higher precision. If we find one that stores only differences, then we can lower its precision. As an example, we measured mixed-precision execution on the Airfoil application (an industrially representative CFD code, a finite volume simulation that solves the 2D Euler equations): using two NVIDIA V100 GPUs the speed-up is 1.11X (using all FP32 it would be 1.44X), and using 64 Intel Xeon processors the speed-up is 1.13X (using all FP32 it would be 1.76X).
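The two ingredients of Mixed Precision Training mentioned in this abstract, loss scaling and higher-precision master weights, can be illustrated with a few lines of standard-library Python. This is only a numerical sketch: `to_fp16`, `fp16_gradient` and `sgd_step` are hypothetical helper names, Python floats (FP64) stand in for the FP32 master copy, and the scale factor 1024 is an arbitrary example value.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def fp16_gradient(true_grad, loss_scale=1.0):
    """Gradient as seen after an FP16 backward pass with loss scaling.

    Scaling the loss scales the gradient by the same factor before it is
    rounded to FP16; the factor is divided out again in higher precision.
    """
    scaled = to_fp16(true_grad * loss_scale)   # value stored in FP16
    return scaled / loss_scale                 # unscaled in higher precision

def sgd_step(master_weight, grad, lr):
    """Update the high-precision master weight; also return its FP16 copy."""
    master_weight -= lr * grad
    return master_weight, to_fp16(master_weight)

# 1. A tiny gradient underflows in FP16 unless the loss is scaled up.
tiny_grad = 1e-8
print(fp16_gradient(tiny_grad))           # 0.0: below the smallest FP16 subnormal
print(fp16_gradient(tiny_grad, 1024.0))   # close to 1e-8

# 2. Small updates vanish if the weight itself is kept only in FP16.
fp16_only = 1.0
master = 1.0
for _ in range(10):
    fp16_only = to_fp16(fp16_only - 1e-4)       # rounds back to 1.0 every time
    master, fp16_copy = sgd_step(master, 1.0, 1e-4)
print(fp16_only, master, fp16_copy)
```

The second experiment mirrors the accumulator rule stated in the abstract: a variable that accumulates many small contributions is the one to keep in higher precision.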
15:40
Coffee break
15:40 - 16:00
16:00
Laboratory observation of water surface polygon vortices
Adam Kadlecsik
16:00 - 16:20
It is a well-known phenomenon that when a filled bucket is rotated around its axis, the water surface takes on a paraboloid shape. A less trivial case is when only the bottom of the bucket rotates and the walls are stationary. In this case, a velocity shear emerges between the liquid near the rotating bottom and the stationary wall, creating rotating polygon-like shapes. We reproduce this phenomenon and build a physical understanding of it.
16:20
Hydrolysis of N,N-dimethylindole-3-ethaniminium cation, the oxidized form of the endogenous psychedelic N,N-dimethyltryptamine
Károly Kubicskó (ELTE Faculty of Science, Institute of Chemistry)
16:20 - 16:40
Monoamine oxidase (MAO) is a flavoenzyme that performs the oxidation of monoamine neurotransmitters such as serotonin, dopamine, and norepinephrine, as well as their structurally related neuromodulator compounds, usually called "trace amines" (TAs) in reference to their lower concentration compared to the main neurotransmitters. The latter group includes tryptamine (T) and phenylethylamine (PEA) as well as their derivatives. They did not receive much scientific interest before the discovery of the G protein-coupled human trace amine-associated receptors (TAARs). Irregularities in TA levels have been linked to numerous mental disorders such as schizophrenia, major depression, bipolar disorder, anxiety, attention deficit hyperactivity disorder (ADHD), and substance abuse disorders. MAO has two isoforms, MAO-A and MAO-B. Their primary structures (sequences of amino acids) share around 70% identity, but their distribution in tissues and their substrate selectivity differ. MAOs play a crucial role in the breakdown and inactivation of monoamine compounds in the body; they are therefore responsible for regulating the levels of these compounds. N,N-dimethyltryptamine (DMT), a compound belonging to the TA group, is a naturally occurring serotonergic indole alkaloid with profound psychedelic (mind-altering) effects on the human psyche. It has recently been discovered that DMT is a natural ligand of sigma-1 receptors and that it plays an important role in tissue protection, regeneration, and immunity. In vitro experiments revealed that DMT shows potent protective effects against hypoxia. We have investigated the metabolism of DMT by the monoamine oxidase A enzyme using multilayer QM:MM quantum chemical calculations. MAO converts DMT into a positively charged iminium ion, namely the N,N-dimethylindole-3-ethaniminium cation (imDMT$^+$).
In order to examine the metabolism of endogenous DMT further, we decided to study in detail the hydrolysis of imDMT$^+$, which results in indole-3-acetaldehyde (IAL) and dimethylamine. Three different systems (or reaction paths) were examined, which include the imDMT$^+$ cation and one OH$^-$ ion with zero ($R_0$), one ($R_1$), and two ($R_2$) H$_2$O molecules, respectively. The largest system, containing two H$_2$O molecules, is shown in Figure 1. Our results demonstrate that the presence of water molecule(s) opens the possibility of an intermolecular proton transfer in the third step of the reaction (Figure 2) and dramatically reduces the corresponding barriers ($R_1$, $R_2$) compared to the intramolecular ($R_0$) case.
16:40
Parallel proton CT image reconstruction
Akos Sudar (MTA Wigner FK)
16:40 - 17:00
Modern proton Computed Tomography (pCT) images are usually reconstructed by algebraic reconstruction techniques (ART). The Kaczmarz method and its variations, which are iterative solution techniques for linear problems with sparse matrices, are among the most widely used. One can ask whether statistically motivated iterations, which have been successfully used for emission tomography, can be applied to reconstruct pCT images as well. In my research, I developed a method based on the Richardson-Lucy deconvolution, a statistically motivated fixed point iteration. I implemented this algorithm as parallel GPU code, with spline-based trajectory calculation and on-the-fly system matrix generation. My results show that the method works well and that it can be successfully applied in pCT applications.
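To make the idea of a Richardson-Lucy type fixed point iteration concrete, here is a minimal sketch of the standard multiplicative update for a linear model `measured = system @ x` with non-negative entries. It uses a small dense matrix held in memory and assumes every column of the matrix has a positive sum. The abstract's implementation instead generates the system matrix on the fly from spline-based proton trajectories, which is not shown, and the function name and arguments are hypothetical.

```python
def richardson_lucy(system, measured, n_iter, x0=None):
    """Multiplicative Richardson-Lucy style iteration for measured ~ system @ x.

    Update rule (element-wise):
        x <- x * (system^T (measured / (system @ x))) / (system^T 1)
    All entries of system, measured and x0 are assumed non-negative,
    and every column of system is assumed to have a positive sum.
    """
    n_rows, n_cols = len(system), len(system[0])
    x = list(x0) if x0 is not None else [1.0] * n_cols
    col_sums = [sum(system[i][j] for i in range(n_rows)) for j in range(n_cols)]

    for _ in range(n_iter):
        # Forward projection of the current estimate.
        forward = [sum(a * v for a, v in zip(row, x)) for row in system]
        # Ratio between measurement and prediction for every row.
        ratio = [m / f if f > 0 else 0.0 for m, f in zip(measured, forward)]
        # Back-project the ratios and apply the multiplicative correction.
        for j in range(n_cols):
            back = sum(system[i][j] * ratio[i] for i in range(n_rows))
            x[j] *= back / col_sums[j]
    return x

# Toy problem: two unknowns, three consistent measurements.
system = [[1.0, 0.0],
          [0.0, 1.0],
          [1.0, 1.0]]
measured = [2.0, 3.0, 5.0]        # generated from x = [2, 3]
print(richardson_lucy(system, measured, n_iter=60))
```

Because the update is multiplicative, a non-negative starting image stays non-negative. The forward and back projections are independent across rows and columns respectively, which is what makes the scheme attractive for a GPU.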
17:00
The challenges and methods of tuning the HIJING++ Monte Carlo event generator
Balázs Majoros
17:00 - 17:20
17:20
AlphaFold2 transmembrane protein structure prediction shines
Tamas Hegedus (Semmelweis University)
17:20 - 18:00
Transmembrane (TM) proteins are major drug targets, as indicated by the high percentage of prescription drugs acting on them. For rational drug design and an understanding of mutational effects on protein function, structural data at atomic resolution are required. However, hydrophobic TM proteins often resist experimental structure determination and, in spite of the increasing number of cryo-EM structures, the available TM folds are still limited in the Protein Data Bank. Recently, DeepMind’s AlphaFold2 machine learning method greatly expanded the structural coverage of sequences, with high accuracy. Since the employed algorithm did not take specific properties of TM proteins into account, the validity of the generated TM structures should be assessed. Therefore, we investigated the quality of structures at genome scales, at the level of ABC protein superfamily folds, and also in specific individual cases. We also tested template-free structure prediction with a new TM fold, dimer modeling, and stability in molecular dynamics simulations. Our results strongly suggest that AlphaFold2 performs astoundingly well in the case of TM proteins and that its neural network is not overfitted. We conclude that a careful application of its structural models will advance TM protein associated studies at an unexpected level. URL: http://alphafold.hegelab.org Acknowledgements: Cystic Fibrosis Foundation: HEGEDU20I0 and NRDIO: K127961 (TH); CCF LUKACS20G0, CIHR, CFI and Canada Research Chair Program (GLL); Swiss National Funds 310030_197563 (MG). Thanks to https://hpc.kifu.hu, https://www.mpibpc.mpg.de/grubmueller, http://gpu.wigner.mta.hu.
Thursday, 11 November 2021
09:00
Social biases in AI
Balázs Keszthelyi (TechnoLynx Ltd.)
09:00 - 09:40
Fairness in AI is constantly evolving from a regulatory point of view, and the need for attention to this topic has become painfully clear after incidents in the past few years. In our presentation, we summarize examples of gender and racial bias in AI systems and touch upon the latest regulatory trends, including ALTAI. We discuss some best practices of bias mitigation, as well as the challenges of detecting bias and defining it clearly.
09:40
20 Years of Static Dataflow
Oskar Mencer
09:40 - 10:20
10:20
Boson sampling simulation enhanced by FPGA based data-flow engines
Peter Rakyta (Department of Physics of Complex Systems, Eötvös Loránd University)
10:20 - 10:40
As was shown by the pioneering work of Scott Aaronson and Alex Arkhipov, bosonic systems are promising candidates to demonstrate quantum advantage. Due to the nature of quantum states describing indistinguishable bosons, the exact simulation of particle-number-resolved bosonic systems is computationally very hard. One of the main objectives of the Laboratory of Quantum Computer Simulators in Budapest (launched as a collaboration of the Department of Physics of Complex Systems and the Department of Programming Languages and Compilers of the Eötvös Loránd University, and the Department for Computational Sciences of the Wigner Research Centre for Physics) is to develop new methods to make the simulation of these systems more efficient. According to our recent experience, FPGA-based data-flow engines (DFEs) seem to be promising architectures to enhance the simulation of bosonic systems on classical hardware. On platforms supporting the data-flow programming model, one has instant access to data generated during the computational process, without the overhead of passing the data between the memory and the central processing units (CPUs). In particular, we argue that DFEs are suitable for evaluating, with high precision, matrix functions associated with the simulation of different variants of Boson Sampling. One such matrix function is the permanent of a square matrix. In the talk I will present our DFE implementation to calculate the permanent of a unitary matrix describing a bosonic quantum interferometer using 128-bit fixed-point arithmetic. We provide a benchmark of our implementation, calculating the permanent of a matrix up to a size of 28x28 on a single FPGA chip, and up to a matrix size of 40x40 on a dual FPGA chip configuration. Our results outperform previous benchmarks of permanent calculation both in performance and in numerical precision. We incorporated our DFE permanent calculator into the Piquasso bosonic quantum computer simulator.
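For readers unfamiliar with the matrix permanent mentioned in the abstract, the following is a minimal CPU-side sketch in Python. It uses Ryser's formula, a standard inclusion-exclusion method; the abstract does not say which algorithm the DFE implementation uses, so this choice, the function name `permanent_ryser`, and the plain (non-Gray-code) subset enumeration are assumptions made purely for illustration. It is unrelated to the 128-bit fixed-point FPGA design described in the talk.

```python
from itertools import combinations


def permanent_ryser(a):
    """Permanent of a square matrix (list of rows) via Ryser's formula.

    perm(A) = sum over non-empty column subsets S of
              (-1)^(n - |S|) * prod_i ( sum_{j in S} a[i][j] )

    Illustrative only: runs in O(2^n * n^2) time, so it is practical
    for small matrices, not for the sizes quoted in the abstract.
    """
    n = len(a)
    if n == 0:
        return 1  # permanent of the empty matrix is 1 by convention
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (n - k)
        for cols in combinations(range(n), k):
            prod = 1
            for row in a:
                # sum of the selected columns within this row
                prod *= sum(row[j] for j in cols)
            total += sign * prod
    return total


# Small check against the definition: perm([[1, 2], [3, 4]]) = 1*4 + 2*3
print(permanent_ryser([[1, 2], [3, 4]]))  # 10
```

The exponential cost in the matrix size is what makes the permanent a natural target for hardware acceleration; the sketch also accepts complex entries, as would be needed for a unitary interferometer matrix.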
10:40
Coffee Break
10:40 - 11:00
11:00
CERN Quantum Technology Initiative unveils strategic roadmap shaping CERN’s role in next quantum revolution
Michele Grossi
11:00 - 11:40
11:40
Application of Machine Learning tools in heavy-ion collisions at the Large Hadron Collider
Neelkamal Mallick (IIT Indore)
11:40 - 12:00
12:00
Introduction to photonic quantum machine learning
Dániel Nagy (Wigner Research Centre for Physics)
12:00 - 12:30
Possibly the most influential achievements of modern computer science are the inventions of different machine learning algorithms, especially deep neural networks, which were able to solve problems that were previously intractable for computers, for example recognizing different animals in photos. At the same time, the last few decades have seen enormous improvement in quantum computing, especially in quantum hardware development. Combining classical machine learning methods with the power of quantum computing gives rise to a new field called quantum machine learning. We present a few quantum machine learning algorithms that use the continuous-variable paradigm of quantum computing.
12:30
Lunch
12:30 - 14:00