On the 25th, at 4 p.m., Professor Philip Wadler will give the seminar “The Two Cultures of Artificial Intelligence”.
Graduate Seminar: “The Two Cultures of Artificial Intelligence”
Seminar Abstract: Everyone is talking about new advances in Artificial Intelligence (AI): texts written by ChatGPT, images drawn by Midjourney, and self-driving cars from Tesla. When I was a sophomore I learned the fundamentals of my subject from John McCarthy, founder of AI and a pioneer of programming. In the early days, AI debated the merits of two complementary methods: logic vs heuristics. Typical of the first is proving properties of programs, which became my research interest. Typical of the second is machine learning, the foundation of ChatGPT, Midjourney, and self-driving.
This talk will contrast the two approaches, discussing the benefits and risks of each, and how the first may curb shortcomings of the second. Artists and writers are worried that AI will put them out of a job. One of the next professions on the list is programmers. Already, ChatGPT and related systems can do a credible job of generating simple programs, such as code for web pages. However, also already, such systems have demonstrated that they routinely write code containing known security bugs.
One possible scenario is that heuristic techniques will prove as adequate as humans—and far cheaper—at simple tasks, putting writers, artists, and programmers out of work. Bereft of new data to learn from, the machine learning applications will then fall into stagnation. They will be fine at producing articles, art, and code close to what has been produced before, but unable to produce anything original. And by then there may no longer be writers, artists, or programmers to hire, as who would study for a profession where no one can find work because they’ve been displaced by machines?
A different scenario is to pass laws to ensure that writers and artists are fairly recompensed when AI generates artifacts based on their work. Regarding code, the logical techniques have shown they can vastly improve reliability. Synthesising logical and heuristic techniques may lead to code that is both cheaper and more reliable. Programmers would shift from writing code to writing logical specifications, with AI helping to generate code proved to meet those specifications.
Meet the Professor: Philip Wadler is Professor of Theoretical Computer Science at the University of Edinburgh and Senior Research Fellow at IOHK. He is a Fellow of the Royal Society, a Fellow of the Royal Society of Edinburgh, and an ACM Fellow. He is head of the steering committee for Proceedings of the ACM, past editor-in-chief of PACMPL and JFP, past chair of ACM SIGPLAN, past holder of a Royal Society-Wolfson Research Merit Fellowship, winner of the SIGPLAN Distinguished Service Award, and a winner of the POPL Most Influential Paper Award. He has an h-index of over 70 with more than 25,000 citations to his work, according to Google Scholar. He contributed to the designs of Java and XQuery, and is co-author of Introduction to Functional Programming (Prentice Hall, 1988), XQuery from the Experts (Addison Wesley, 2004), Generics and Collections in Java (O’Reilly, 2006), and Programming Language Foundations in Agda (2018). He is a principal designer of the Haskell programming language, contributing to its two main innovations, type classes and monads. The YouTube video of his Strange Loop talk Propositions as Types has over 100,000 views.
To watch the seminar, go to: https://youtube.com/live/FQI_9Yb6kik
On March 15, Professor Sérgio Lifschitz will give the seminar “Bancos de Dados e Redes Sociais Digitais” (Databases and Digital Social Networks).
Graduate Seminar: “Banco de Dados e Redes Sociais Digitais” (Databases and Digital Social Networks)
Seminar Abstract: Communication through so-called Digital (or Online) Social Networks (DSNs) is an important part of everyday life in our society. The observation and analysis of DSN data reflect, to a significant degree, people's behavior and positions in their everyday offline lives. Although DSNs are classic examples of Big Data systems, given their large data volumes and the speed at which the data spread, database systems, whether relational or NoSQL, are rarely or never used by research groups. Motivated by this observation, in this talk I intend to show how the broad field of data (data engineering, data science, and databases) can contribute to relevant scientific and technological research. The emphasis will be on work already completed, or in progress, by the team of researchers (students and collaborators) at the BioBD Laboratory of DI PUC-Rio.
Meet the Professor: Sérgio Lifschitz is a member of the permanent faculty of DI and coordinator of the BioBD Laboratory at PUC-Rio. He holds a doctorate in Informatics from ENST/Télécom Paris, France, and master's and undergraduate degrees in Electrical Engineering, both from PUC-Rio. He is a researcher in data engineering, data science, and databases, with emphasis on (i) automatic fine-tuning, (ii) knowledge bases and knowledge graphs, and (iii) management of digital social networks. He also works in bioinformatics, developing tools in partnership with Fiocruz and INCA. He is vice-dean for internationalization of the Centro Técnico Científico (CTC) and a member of the NDE of the Computer Engineering program, among other technical and administrative activities.
For more information about the content and how to watch, go to: https://youtube.com/live/apTTNpDXEBE
Master's dissertation defense of student Matheus Kerber.
Dissertation title: Fast and Accurate Simulation of Deformable Solid Dynamics on Coarse Meshes
Abstract: This thesis introduces a novel hybrid simulator that combines a numerical Finite Element (FE) Partial Differential Equation solver with a Message Passing Neural Network (MPNN) to perform simulations of deformable solid dynamics on coarse meshes. Our work aims to provide accurate simulations with an error comparable to that obtained with more refined meshes in FE discretizations while maintaining computational efficiency by using an MPNN component that corrects the numerical errors associated with using a coarse mesh. We evaluate our model focusing on accuracy, generalization capacity, and computational speed compared to a reference numerical solver that uses 64 times more refined meshes. We introduce a new dataset for this comparison, encompassing three numerical benchmark cases: (i) free deformation after an initial impulse, (ii) stretching, and (iii) torsion of deformable solids. Based on simulation results, the study thoroughly discusses our method's strengths and weaknesses. The study shows that our method corrects an average of 95.9% of the numerical error associated with discretization while being up to 88 times faster than the reference solver. On top of that, our model is fully differentiable and can be embedded into a neural network layer, allowing it to be easily extended by future work. Our contributions also include demonstrating that our method achieves better results in learning and generalization capacity when compared to a purely data-oriented baseline simulator. Data and code are made available on <github link> for further investigations.
Advisor: Prof. Dr. Waldemar Celes Filho
Examination committee: Prof. Dr. Jose Alberto Rodrigues Pereira Sardinha | Prof. Dr. Ivan Fabio Mota de Menezes | Prof. Dr. Leonardo Seperuelo Duarte
Watch the defense at: https://puc-rio.zoom.us/j/92665440011?pwd=UnNTR3RwcUNFd1hpVUVoUDJKODdodz09#success
Author: Matheus Kerber Venturelli
Advisor: Waldemar Celes Filho
Date and time: 15/03/2024 at 10:00
Location: Videoconference
Master's dissertation defense of student Pedro Henrique Barroso Gomes.
Dissertation title: FCGAN: Convoluções Espectrais via Transformada Rápida de Fourier para Campos Receptivos de Abrangência Global em Redes Adversárias Generativas (FCGAN: Spectral Convolutions via Fast Fourier Transform for Global Receptive Fields in Generative Adversarial Networks)
Abstract: This dissertation proposes the Fast Fourier Convolution Generative Adversarial Network (FCGAN). This novel approach uses convolution in the frequency domain to let the network operate with a global receptive field. Because of their small receptive fields, GANs based on traditional convolutions struggle to capture structural and geometric patterns. Our method uses Fast Fourier Convolutions (FFCs), which rely on Fourier transforms to operate in the spectral domain, affecting the image channels globally. As a result, FCGAN can generate images taking into account information from all locations of the input maps. This new property of the network can lead to erratic and unstable performance. We show that spectral normalization and noise injection stabilize adversarial training. The use of spectral convolutions in convolutional networks has been explored for tasks such as image inpainting and super-resolution. This work focuses on their potential for image generation. Our experiments also support the claim that Fourier features are low-cost substitutes for self-attention layers, allowing the network to learn global information from the early layers. We present qualitative and quantitative results showing that the proposed FCGAN achieves results comparable to state-of-the-art approaches with similar depth and number of parameters, reaching an FID of 18.98 on CIFAR-10 and 38.71 on STL-10, a reduction of 4.98 and 1.40, respectively. In addition, at larger image sizes, using FFCs instead of self-attention allows batch sizes up to twice as large and iterations up to 26% faster.
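The global receptive field described in the abstract follows from the convolution theorem: multiplying two spectra pointwise is equivalent to a circular convolution whose kernel spans the whole input. The sketch below illustrates only that property on a 1D signal. It is not the FCGAN or FFC layer; the function names are hypothetical, and a naive O(n²) DFT stands in for the Fast Fourier Transform so the example needs only the standard library.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * t / n) for f in range(n)) / n
        for t in range(n)
    ]


def spectral_convolution(signal, kernel):
    """Convolve by multiplying the two spectra pointwise.

    Assumes signal and kernel are real-valued and have the same length.
    """
    product = [s * k for s, k in zip(dft(signal), dft(kernel))]
    # The imaginary parts are numerical noise for real inputs.
    return [value.real for value in idft(product)]


def circular_convolution(signal, kernel):
    """Reference circular convolution computed directly in the signal domain."""
    n = len(signal)
    return [
        sum(signal[m] * kernel[(t - m) % n] for m in range(n))
        for t in range(n)
    ]


signal = [1.0, 2.0, 3.0, 4.0]
kernel = [1.0, 0.0, -1.0, 0.5]
print(spectral_convolution(signal, kernel))
print(circular_convolution(signal, kernel))
```

Both calls print the same values up to floating-point error. Every output position depends on every input position, which is the sense in which a frequency-domain operation has a global receptive field.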
Advisor: Prof. Dr. Marcelo Gattass
Examination committee: Prof. Dr. Jose Alberto Rodrigues Pereira Sardinha | Prof. Dr. Italo de Oliveira Matias | Prof. Dr. Jan Jose Hurtado Jauregui | Prof. Dr. Alberto Barbosa Raposo
Watch the defense at: https://puc-rio.zoom.us/j/97451706923?pwd=b2tNNEQzMmpBeU9vMkFhNzB2bnY0dz09
Doctoral thesis defense of student Luis Fernando Marin Sepulveda.
Thesis title: Generalization of the Deep Learning Model for Natural Gas Indication in 2D Seismic Image Based on the Training Dataset and the Operational Hyper Parameters Recommendation
Abstract: Interpreting seismic images is an essential task in diverse fields of geosciences, and it is a widely used method in hydrocarbon exploration. However, this interpretation requires a significant investment of resources, and obtaining a satisfactory result is not always possible. The literature shows an increasing number of Deep Learning (DL) methods to detect horizons, faults, and potential hydrocarbon reservoirs. Nevertheless, the models to detect gas reservoirs present generalization difficulties, i.e., performance is compromised when they are used on seismic images from new exploration campaigns. This problem is especially acute for 2D land surveys, where the acquisition process varies and the images are very noisy. This work presents three methods to improve the generalization performance of DL models of natural gas indication in 2D seismic images. For this task, approaches that come from Machine Learning (ML) and DL are used. The research focuses on data analysis to recognize patterns within the seismic images to enable the selection of training sets for the gas inference model based on patterns in the target images. This approach allows better generalization performance without altering the architecture of the gas inference DL model or transforming the original seismic traces. The experiments were carried out using the database of different exploitation fields located in the Parnaíba basin, in northeastern Brazil. The results show an increase of up to 39% in the correct indication of natural gas according to the recall metric. This improvement varies in each field and depends on the proposed method used and the existence of representative patterns within the training set of seismic images. Overall, the generalization performance of the DL gas inference model improves by up to 21% according to the F1 score and up to 15% according to the IoU metric.
These results demonstrate that it is possible to find patterns within the seismic images using an unsupervised approach, and that these patterns can be used to recommend the DL training set according to the pattern in the target seismic image. Furthermore, they demonstrate that the training set directly affects the generalization performance of the DL model for seismic images.
Advisor: Prof. Dr. Marcelo Gattass
Co-advisor: Prof. Dr. Aristófanes Corrêa Silva
Examination committee: Prof. Dr. Raul Queiroz Feitosa | Prof. Dr. Jan Jose Hurtado Jauregui | Prof. Dr. Kelson Romulo Teixeira Aires | Prof. Dr. António Manuel Trigueiros da Silva Cunha
Watch the defense at: https://puc-rio.zoom.us/j/93188594035?pwd=NWhtamxYNGZvamZNdmQ4V0wrNVBOZz09
Author: Luis Fernando Marin Sepulveda
Advisor: Marcelo Gattass
Date and time: 24/01/2024 at 09:00
Location: Videoconference
Master's dissertation defense of student Bruno Francisco Martins da Silva.
Dissertation title: Vector Stream Similarity Search Methods
Abstract: A vector stream can be modelled as a sequence of pairs ((v_1, t_1), …, (v_n, t_n)), where v_k is a vector and t_k is a timestamp such that all vectors are of the same dimension and t_k < t_{k+1}. The vector stream similarity search problem is defined as: “Given a (high-dimensional) vector q and a time interval T, find a ranked list of vectors, retrieved from a vector stream, that are similar to q and that were received in the time interval T”. This dissertation first introduces a family of vector stream similarity search methods that do not depend on having the full set of vectors available beforehand but adapt to the vector stream as the vectors are added. The methods generate a sequence of indices that are used to implement approximate nearest neighbour search over the vector stream. Then, the dissertation describes an implementation of a method in the family based on Hierarchical Navigable Small World graphs. Based on this implementation, the dissertation presents a Classified Ad Retrieval tool that supports classified ad retrieval as new ads are continuously submitted. The tool is structured into a main module and three auxiliary modules, where the main module is responsible for coordinating the auxiliary modules and for providing a user interface, and the auxiliary modules are responsible for text and image encoding, vector stream indexing, and data storage. To evaluate the tool, the dissertation uses a dataset with approximately 1 million records with descriptions of classified ads and their respective images. The results showed that the tool reached an average precision of 98% and an average recall of 97%.
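To make the problem statement concrete, here is a minimal sketch of an exact, brute-force baseline for vector stream similarity search. It is not the dissertation's method, which builds approximate indices based on Hierarchical Navigable Small World graphs. The class and function names are hypothetical, cosine similarity is an assumed similarity measure, and the time interval T is assumed to be the closed interval [t_start, t_end].

```python
import math
from bisect import bisect_left, bisect_right


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorStream:
    """Append-only sequence of (vector, timestamp) pairs."""

    def __init__(self):
        self.timestamps = []
        self.vectors = []

    def append(self, vector, timestamp):
        # Enforce the stream definition: t_k < t_{k+1} and a fixed dimension.
        if self.timestamps and timestamp <= self.timestamps[-1]:
            raise ValueError("timestamps must be strictly increasing")
        if self.vectors and len(vector) != len(self.vectors[0]):
            raise ValueError("all vectors must have the same dimension")
        self.timestamps.append(timestamp)
        self.vectors.append(list(vector))

    def search(self, query, t_start, t_end, k):
        """Return up to k (score, timestamp, vector) tuples, best match first."""
        # Timestamps are sorted, so the interval maps to a contiguous slice.
        lo = bisect_left(self.timestamps, t_start)
        hi = bisect_right(self.timestamps, t_end)
        scored = [
            (cosine_similarity(query, self.vectors[i]), self.timestamps[i], self.vectors[i])
            for i in range(lo, hi)
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:k]


stream = VectorStream()
stream.append([1.0, 0.0], 1)
stream.append([0.0, 1.0], 2)
stream.append([1.0, 1.0], 3)
stream.append([1.0, 0.1], 4)
for score, timestamp, vector in stream.search([1.0, 0.0], 2, 4, k=2):
    print(timestamp, vector, round(score, 3))
```

This baseline scans every vector in the interval, so its cost grows linearly with the number of vectors received in T. An index-based approach such as the one the abstract describes trades exactness for speed on large streams.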
Advisor: Prof. Dr. Marco Antonio Casanova
Examination committee: Prof. Dr. Antonio Luz Furtado | Prof. Dr. Luiz André Portes Paes Leme | Prof. Dr. Vânia Maria Ponte Vidal
Watch the defense at: https://puc-rio.zoom.us/j/93760975741?pwd=YXVNcUQzTTlNa2ZlOVhyd1BhLzkwdz09
This week the Vacation Course “Soccer Analytics: Understand soccer through the lens of data analysis!” is taking place, run by the Department of Informatics at PUC-Rio in partnership with MIT (Massachusetts Institute of Technology).
Participants are experiencing the course in a workshop format, which consists of:
Lectures with hands-on practice
Career highlights
Interactive activities
of computers and screens
Team project
See what the students' experience has been like during the course, which ends this Friday (19/01).
Author: Bruno Francisco Martins da Silva
Advisor: Marco Antonio Casanova
Date and time: 24/01/2024 at 10:00
Location: Videoconference