Resources for Learning & Research
Software Tools for Professional Writing
- LaTeX – a high-quality typesetting system for all computing platforms.
- TikZ and PGF are TeX packages for creating sophisticated graphics programmatically. TikZ is built on top of PGF.
- Overleaf.com for collaborative cloud-based authoring using LaTeX.
- Quarto – An open-source scientific and technical publishing system.
- Grammarly – an AI-powered writing assistant.
- Mathpix extracts equations as LaTeX code from PDFs or handwritten notes.
How to do Good Research and Publish
- IEEE Computer Society digital library
- ACM Digital Library
- Google Scholar
- CiteSeerX
- Ultimate research assistant
- How to do good research and publish
- How to write a good research paper
- How to read a paper
- Scientific writing for computer science students
- Efficient Reading of Technical Papers in Science and Technology
- Improving your Technical Writing Skills
- Written Communication
- Oral Communication
Writing an Annotated Bibliography
- An Annotated Bibliography Project Specification
- A LaTeX Template for Preparing an Annotated Bibliography (PDF)
- A LaTeX Template for Preparing an Annotated Bibliography (LaTeX source file)
- A BibTeX file (nlp.bib) for the Annotated Bibliography
Automated Question Generation
- Dialog & Discourse — An Open Access Journal on Language
- Third Workshop on Question Generation QG2010 — Proceedings
- Second Workshop on Question Generation QG2009 — Proceedings
- NSF Workshop on the Question Generation Shared Task and Evaluation Challenge
- Automatically Generating Questions in Multiple Variables for Intelligent Tutoring
- Automatic Generation of Questions for On-line Evaluation
- Automatic Question Generation for REAP.PT Tutoring System
- Automatic Question Pattern Generation for Ontology-based Question Answering
- Automatic Generation of Multiple Choice Questions From Domain Ontologies
- A Taxonomy of Questions for Question Generation
- Automatic Question Generation for Vocabulary Assessment
- Automatic Question Generation from Queries
- DynaLearn Question Generation and Answering
- FAST: An Automatic Generation System for Grammar Tests
Systematic Research Literature Search and Writing Reviews
- What is a Literature Review?
- Analyzing the past to prepare for the future: writing a literature review
- Booth, W., Colomb, G., and Williams, J. (1995). The craft of research. Chicago: University of Chicago Press. Summary
- Towards a Framework of Literature Review Process in Support of Information Systems Research
- Guidelines for Performing Systematic Literature Reviews in Software Engineering — Summary
- Guidelines for Performing Systematic Literature Reviews in Software Engineering — Complete Article
- Empirical Studies of Agile Software Development: A Systematic Review
- Factors associated with success in medical school: systematic review of the literature
Computer Vision and Image Processing
- IPOL Journal (Image Processing On Line): IPOL is a research journal of image processing and image analysis. Each article contains a text on an algorithm and its source code, with an online demonstration facility and an archive of experiments. Text and source code are peer-reviewed and the demonstration is controlled. IPOL is an Open Science and Reproducible Research journal.
- First Principles of Computer Vision.
- Supercamera: More Pixels Than You Know What To Do With.
- OpenCV: An open source C/C++ library with over 2,500 implemented algorithms for real-time computer vision applications such as object recognition, shape detection, depth estimation, tracking moving objects, extracting 3D models, and overlaying augmented reality.
- Image Processing and Analysis in Java
- Computer Vision Platform using Python
- ImageMagick: A software suite to create, edit, compose, or convert images.
- Geograph Project: Aims to collect geographically representative photographs and information for every square kilometer of Great Britain and Ireland.
- Flickr — billions of images
- NASA Image Galleries
- Professor Alireza Saberi’s Image and Video Processing lectures – YouTube Playlist
- Richard Hamming’s Lectures on Digital filters – Part 1
- Richard Hamming’s Lectures on Digital filters – Part 2
- Richard Hamming’s Lectures on Digital filters – Part 3
- Richard Hamming’s Lectures on Digital filters – Part 4
- Computer Vision: Algorithms and Applications, a book by Richard Szeliski, Microsoft Research
Computational Analysis of Natural Language and Speech
- awesome-nlp: a curated list of resources dedicated to natural language processing (NLP)
- Association for Computational Linguistics
- Natural Language Toolkit (NLTK)
- WordNet: A Lexical Database for English.
- Linguistic Data Consortium.
- SRILM: The SRI Language Modeling Toolkit.
- Google NGram Viewer
- Stanford CoreNLP
- Apache Lucene: A high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.
- MontyLingua: A free, commonsense-enriched natural language understander for English.
- tm: A framework for developing text mining applications with R.
- NLP Research at Microsoft
- NLP Research at Google
- NLP Research at IBM
- IBM Watson
- Open Text Summarizer
- Weka 3: Data Mining Software in Java
- GATE: General Architecture for Text Engineering, open source software capable of solving almost any text processing problem.
- OpenNLP: The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text.
- HTK: The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research although it has been used for numerous other applications including research into speech synthesis, character recognition and DNA sequencing.
- VoiceBox: A speech processing toolkit based on MATLAB.
- Praat: doing phonetics by computer.
- openSMILE: The Munich versatile and fast open-source audio feature extractor.
Data Management and Big Data
- DB-Engines
- Forrester Big Data Blog
- DATAVERSITY
- Data Stories
- ZDNet Blog on Big Data
- IBM Big Data Cloud
- Oracle Data Warehouse Insider
- Square Kilometer Array Radio Telescope
- Heroku Cloud Application Platform
- Heroku Dev Center
- Download PostgreSQL from EnterpriseDB: PostgreSQL is a powerful, open source object-relational database system. It has more than 15 years of active development and a proven architecture that has earned it a strong reputation for reliability, data integrity, and correctness.
- Download pgAdmin: An open source tool for PostgreSQL administration and database development.
- PostgreSQL Studio: Open Source Web Interface for PostgreSQL.
- TeamPostgreSQL: An open source, Web browser-based tool for PostgreSQL administration and database development.
- SQL Power Architect: An open source tool for conceptual data modeling and profiling.
- PostgreSQL Wiki: This wiki contains user documentation, how-tos, and tips ‘n’ tricks related to PostgreSQL. It also serves as a collaboration area for PostgreSQL contributors.
- A Relational Algebra interpreter
- XML data repository.
- The Mondial database.
- Apache Mahout: Scalable machine learning and data mining.
- Automated test data generation using Datanamic Data Generator MultiDB 2011
- GenerateData.com
- DTM Data Generator
- Database test data generation (PhD Thesis, 2004)
Books
- Interactive web-based data visualization with R, plotly, and shiny.
- An example of a book written in R Markdown and made available (both the PDF and source files) under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license. Book’s GitHub repository. Find the HTML version of this book here.
- Write HTML, PDF, ePub, and Kindle books with Bookdown.
- Probabilistic Programming in Python with PyMC3.
- A visual introduction to probability and statistics.
- Kevin Patrick Murphy. Machine Learning: A Probabilistic Perspective (2012).
- Kevin Patrick Murphy. Probabilistic Machine Learning: An Introduction (2022).
- Kevin Patrick Murphy. Probabilistic Machine Learning: Advanced Topics (2023).
- David Barber. Bayesian Reasoning and Machine Learning.
- Chris Bishop. Pattern Recognition and Machine Learning.
- David J. C. MacKay. Information Theory, Inference, and Learning Algorithms.
- Christopher Manning. Introduction to Information Retrieval.
- Hossein Pishro-Nik. Introduction to Probability, Statistics, and Random Processes.
- Stanley H. Chan. Introduction to Probability for Data Science. An undergraduate textbook on probability for data science.