The First Workshop on Photorealistic Image and Environment Synthesis for Multimedia Experiments (PIES-ME)
Photorealistic media aim to faithfully represent the world, creating an experience that is perceptually
indistinguishable from a real-world experience. Current standard media applications fall short of this goal,
since acquisition and production technologies in consumer applications do not capture or produce enough
of the world's visual, audio, spatial, and temporal information to faithfully represent it. In recent years,
however, the area of photorealistic media has seen a great deal of activity, with new multimedia areas emerging,
such as light fields, point clouds, ultra-high definition, high frame rate, high dynamic range imaging, and
novel 3D audio and sound field technologies. The combination of these technologies can certainly help
pave the way for hyper-realistic media experiences. But first, several technological
challenges must be overcome. It is worth pointing out that research in this area requires large datasets, software
tools, and powerful infrastructures. Among these, the availability of meaningful datasets with diverse and
high-quality content is of particular importance.
In recent years, the number of vision-based datasets has grown quickly. Some of these datasets provide
photorealistic image sequences created by physically capturing real-world environments, but they are typically limited
to one or two types of images (e.g., monocular and depth) and to small sets of images. To address the limitations
of capturing photorealistic datasets of real-world environments, researchers have begun to render
image sequences of synthesized virtual environments, which allow for more types of images (e.g., monocular,
stereoscopic, depth, and semantic) and often include much larger sets of images. However, many of these
synthetic datasets are not photorealistic because they rely on lower-fidelity virtual objects and/or rasterization-based
rendering techniques. 360° VR datasets have also grown, but most consist of 360° videos captured by
diverse camera hardware and curated from various Internet sources, with varying resolutions, content, and
camera motions. In summary, most available datasets are limited and do not provide researchers with adequate
tools to advance the area of photorealistic applications.
The goal of this workshop is to engage experts and researchers working on the synthesis of photorealistic images
and/or virtual environments, particularly in the form of public datasets, software tools, or infrastructures,
for multimedia research. Such public datasets, software tools, and infrastructures will lower entry barriers
by enabling researchers who lack expensive hardware (e.g., complex camera systems, smart glasses, robots,
autonomous vehicles) to simulate and create datasets representative of such hardware and of various scenarios.
Photorealistic image and environment synthesis can benefit multiple research areas in addition to multimedia
systems, such as machine learning, robotics, computer vision, mixed reality, and virtual reality.
Important Dates
- Paper Submission: July 22, 2022, 11:59 pm Anywhere on Earth (AoE)
- Notification of Acceptance: August 10, 2022 (extended from August 7, 2022)
- Camera-ready version: August 21, 2022, 11:59 pm EDT (extended from August 14, 2022, 11:59 pm EDT)
Supporters
Keynote Speakers
Collaborative Machine Intelligence: Promoting Energy-Efficient IoT Sensing & Edge Analytics
Abstract
To support real-time & sustainable machine intelligence that exploits the rapid growth in sensor
deployments in urban spaces (e.g., video, audio) and on wearable devices (e.g., inertial, radar),
there is a need to optimize the execution of machine learning (ML) pipelines on
resource-constrained embedded devices. To this end, this talk shall describe the vision of
collaborative machine intelligence, where the sensing and inferencing pipelines on individual
wearable and IoT devices collaborate, in real-time, to overcome such resource limitations.
First, I will describe work on tightly coordinated IoT + wearable sensing, which allows the
ultra-low power (even battery-less) capture of fine-grained human gestural activities in various
environments (e.g., in offices and in gyms) by combining IoT sensors & wearable devices. Second,
using a sample video surveillance application, I will describe how IoT-based collaborative
machine inferencing can provide dramatic reductions in energy and latency, as well as
improvements in accuracy. To practically realize this vision, I shall finally argue why edge
computing needs to evolve, from its current focus on pure local computation offloading to a
"Cognitive Edge" platform that enables such collaborative and trusted sense-making across
heterogeneous pervasive devices.
Bio
Archan Misra is a Professor and the Associate Dean of Research in the School of Information
Systems at Singapore Management University (SMU). He is the Director of SMU's Center for
Applied Smart-Nation Analytics (CASA), which is developing pervasive technologies for smart
city infrastructure and applications. Archan has led a number of multi-million dollar,
large-scale research initiatives at SMU, including the LiveLabs research center, and is a
recent recipient of the prestigious Investigator grant (from Singapore's National Research
Foundation) for sustainable man-machine interaction intelligence. Over a 20+ year research
career spanning both academia and industry (at IBM Research and Bellcore), Archan has
published on, and practically deployed, technologies spanning wireless networking, mobile
& wearable sensing, and urban mobility analytics. His current research interests lie in
the ultra-low energy execution of machine intelligence algorithms using wearable and IoT
devices. Archan holds a Ph.D. from the University of Maryland at College Park and chaired
the IEEE Computer Society's Technical Committee on Computer Communications (TCCC) from
2005 to 2007.
Dynamic Watermarking for Security of Cyber-Physical Systems
Abstract
The coming decades may see the large scale deployment of networked cyber–physical systems to
address global needs in areas such as energy, water, health care, and transportation. However,
as recent events have shown, such systems are vulnerable to cyber attacks. We begin by
revisiting classical linear systems theory, developed in more innocent times, from a
security-conscious, even paranoid, viewpoint. Then we present a general technique, called
"dynamic watermarking," for detecting any sort of malicious activity in networked systems of
sensors and actuators. We present experimental demonstrations of this technique on an automobile
on a test track and on an experimental process control system, as well as a simulation study of
defense against an attack on Automatic Gain Control (AGC) in a synthetic four-area power system.
[Joint work with Bharadwaj Satchidanandan, Jaewon Kim, Woo Hyun Ko, Tong Huang, Gopal Kamath,
Lantian Shangguan, Kenny Chour, Le Xie, and Swaminathan Gopalswamy].
Bio
P. R. Kumar's current focus includes Cyber-Physical Systems, Security, Privacy, Unmanned
Aerial System Traffic Management, 5G, Wireless Networks, Machine Learning, and Power
Systems. He studied at IIT Madras and Washington Univ., St. Louis. He served in the Math
Dept at UMBC (1977-84) and in ECE and CSL at UIUC (1985-2011). He is currently at Texas A&M
Univ., where he is a University Distinguished Professor, a Regents Professor, and holds
the College of Engineering Chair in Computer Engineering. He is a member of the U.S. NAE, The
World Academy of Sciences, and the Indian NAE. He was awarded a Doctor Honoris Causa by ETH.
He received the IEEE Field Award for Control Systems, the Eckman Award of the AACC, the Ellersick
Prize of the IEEE ComSoc, the Outstanding Contribution Award of ACM SIGMOBILE, the Infocom
Achievement Award, and the SIGMOBILE Test-of-Time Paper Award. He is a Fellow of the IEEE and
the ACM. He is a Gandhi Distinguished Visiting Professor at IIT Bombay, an Honorary Professor
at IIT Hyderabad, and was Leader of a Guest Chair Professor Group at Tsinghua Univ. He
was awarded a Distinguished Alumnus Award from IIT Madras, an Alumni Achievement Award from
WashU, and the Drucker Eminent Faculty Award from UIUC.
Living at the Edge: Designing Accurate and Efficient Visual Sensing Systems
Abstract
Semantically-rich visual information, generated by surveillance cameras or those on mobile
devices, as well as by 3D sensors that provide depth perception (like LiDAR and stereo-cameras),
is available aplenty at the network's edge. However, these sensors have limited communication
bandwidth to the rest of the network, and sometimes limited on-board compute. This talk will
cover experiences drawn from, and draw out common design patterns that arise in, designing
systems that overcome these challenges to realize a range of interesting capabilities: extended
vehicular vision, real-time visual map updates, cross-camera complex activity detection, and
visual analytics in retail settings.
Bio
Ramesh Govindan is the Northrop Grumman Chair in Engineering and Professor of Computer
Science and Electrical Engineering at the University of Southern California. He received
his B. Tech. degree from the Indian Institute of Technology at Madras, and his M.S. and
Ph.D. degrees from the University of California at Berkeley. His research interests
include routing and measurements in large internets, networked sensing systems, and
mobile computing systems.
Part I: "NSF Funding Opportunities in Advanced Cyberinfrastructure"
Part II: "Exposed Buffer Processing: Spanning the Continuum from HPC to Edge"
Abstract
In every form of digital store-and-forward communication, intermediate forwarding nodes are
computers, with attendant memory and processing resources. For more than 30 years this has
stimulated efforts to create a wide-area infrastructure that goes beyond simple forwarding to
create a platform that makes more general and varied use of increasingly powerful and plentiful
node resources. There have been analogous pressures toward active and networked storage,
processor-in-memory and streaming processors. There is a widespread consensus that it should be
possible to define and deploy a converged wide-area platform that combines these silos seamlessly
and universally. However, a great deal of investment in research prototypes has yet to produce a
credible candidate architecture. Drawing on design analysis, historical examples, and case
studies, this talk presents an argument for the hypothesis that in order to realize a
distributed system with the kind of convergent generality and deployment scalability that might
qualify as "future-defining," we must build it from a small set of simple, generic, and limited
abstractions of the low level resources (processing, storage and network) of its constituent
nodes. The common building blocks out of which these silos are constructed are storage or memory
buffers/blocks and a set of primitive allocation and processing operations on them.
Bio
Micah Beck began his research career in distributed operating systems at Bell
Laboratories and received his Ph.D. in Computer Science from Cornell University (1992)
in the area of parallelizing compilers. He then joined the faculty of the Computer
Science Department at the University of Tennessee, where he is currently an Associate
Professor working in distributed high performance computing, networking and storage.
Program
All events will take place in room "Pav4 R1.06".
All times are in Western European Summer Time (UTC/GMT+1).
Friday, October 14
9:15 - 9:30 Welcome and Opening Remarks
9:30 - 10:00 Delving into Light-Dark Semantic Segmentation for Indoor Scenes Understanding
Xiaowen Ying (Lehigh University), Bo Lang (Lehigh University), Zhihao Zheng (Lehigh University), Mooi Choo Chuah (Lehigh University)
10:00 - 10:30 Language-guided Semantic Style Transfer of 3D Indoor Scenes
Bu Jin (University of Chinese Academy of Sciences), Beiwen Tian (Tsinghua University), Hao Zhao (Peking University), Guyue Zhou (Tsinghua University)
10:30 - 11:00 Coffee Break
11:00 - 11:30 Towards a Calibrated 360 Stereoscopic HDR Image Dataset for Architectural Lighting Studies
Michèle Atié (Nantes Université), Toinon Vigier (Nantes Université), François Eymond (Université de Lyon), Céline Drozd (Nantes Université), Raphaël Labayrade (Université de Lyon), Daniel Siret (Nantes Université), Yannick Sutter (Nantes Université)
11:30 - 12:00 I-SPIES Dataset Reveal
Ryan P. McMahan (University of Central Florida)
12:00 - 13:00 Panel Discussion: "Photorealistic datasets to enable multimedia research: creation, curation and use"
13:00 - 14:00 Lunch
14:00 - 15:00 Keynote Speech: Hyper-Realistic and Immersive Imaging for Enhanced Quality of Experience
Frederic Dufaux (Université Paris-Saclay, CNRS, CentraleSupélec)
15:00 - 15:30 Subjective Study of the Impact of Compression, Framerate, and Navigation Trajectories on the Quality of Free-Viewpoint Video
Jesús Gutiérrez (Universidad Politécnica de Madrid), Adriana Galán (Universidad Politécnica de Madrid), Pablo Pérez (Nokia), Daniel Corregidor (Universidad Politécnica de Madrid), Teresa Hernando (Universidad Politécnica de Madrid), Javier Usón (Universidad Politécnica de Madrid), Daniel Berjón (Universidad Politécnica de Madrid), Julián Cabrera (Universidad Politécnica de Madrid), Narciso García (Universidad Politécnica de Madrid)
15:30 - 16:00 Comparative Evaluation of Temporal Pooling Methods for No-Reference Quality Assessment of Dynamic Point Clouds
Pedro G. Freitas (University of Brasília), Giovani D. Lucafo (University of São Paulo), Mateus Gonçalves (University of Brasília), Johann Homonnai (University of Brasília), Rafael Diniz (University of Brasília), Mylène C.Q. Farias (University of Brasília)
16:00 - 16:15 Closing Remarks