reasoning
Vito Trianni
A fixed-term research position is open for a post-doc, or for a PhD student nearing the end of their doctoral program. The goal of the research is to study hybrid collective intelligence systems for decision support in complex open-ended problems. It involves the design and implementation of a hybrid collective intelligence system that exploits the interaction between human experts and artificial agents, based on knowledge graphs and ontologies for knowledge representation, integration and reasoning.
Dr. Robert Legenstein
For the recently established Cluster of Excellence CoE Bilateral Artificial Intelligence (BILAI), funded by the Austrian Science Fund (FWF), we are looking for more than 50 PhD students and 10 Post-Doc researchers (m/f/d) to join our team at one of the six leading research institutions across Austria. In BILAI, major Austrian players in Artificial Intelligence (AI) are teaming up to work towards Broad AI. As opposed to Narrow AI, which is characterized by task-specific skills, Broad AI seeks to address a wide array of problems, rather than being limited to a single task or domain. To develop its foundations, BILAI employs a Bilateral AI approach, effectively combining sub-symbolic AI (neural networks and machine learning) with symbolic AI (logic, knowledge representation, and reasoning) in various ways. Harnessing the full potential of both symbolic and sub-symbolic approaches can open new avenues for AI, enhancing its ability to solve novel problems, adapt to diverse environments, improve reasoning skills, and increase efficiency in computation and data use. These key features enable a broad range of applications for Broad AI, from drug development and medicine to planning and scheduling, autonomous traffic management, and recommendation systems. Prioritizing fairness, transparency, and explainability, the development of Broad AI is crucial for addressing ethical concerns and ensuring a positive impact on society. The research team is committed to cross-disciplinary work in order to provide theory and models for future AI and deployment to applications.
Llama 3.1 Paper: The Llama Family of Models
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
Improving Language Understanding by Generative Pre-Training
Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied. For instance, we achieve absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI).
Seeing things clearly: Image understanding through hard-attention and reasoning with structured knowledge
In this talk, Jonathan aims to frame the current challenges of explainability and understanding in ML-driven approaches to image processing, and their potential solution through explicit inference techniques.