CSCI/ARTI 8950 Machine Learning

Spring 2009: Tuesdays and Thursdays 3:30pm - 4:45pm & Wednesdays 3:35pm - 4:25pm, Boyd GSRC 208

Instructor: Prof. Khaled Rasheed
Telephone: (706)542-3444
Office Hours: Tuesday: 1-2:30pm and Wednesday: 4:35-6:00pm or by email appointment
Office Location: Room 219B, Boyd GSRC
Email: khaled@cs.uga.edu


Objectives:

Machine learning is a subfield of artificial intelligence concerned with computer programs that automatically improve their capabilities and/or performance by acquiring (learning from) experience. The main objectives of this course are to give students an in-depth introduction to machine learning theory and methods and to explore research problems in machine learning and its applications that may lead to work on a project or a dissertation. The course is intended primarily for computer science and artificial intelligence graduate students. Graduate students from other departments who have a strong interest and sufficient experience in artificial intelligence may also find the course interesting.

Recommended Background:

CSCI/PHIL 4550/6550 Artificial Intelligence or CSCI 4560/6560 Evolutionary Computation (or permission of the instructor). Students should also be familiar with basic computer algorithms and data structures and with at least one high-level programming language.

Topics to be Covered:

  • Part I: Machine learning techniques: Selected from inductive learning, decision trees, neural network approaches, evolutionary computation approaches and classifier systems, reinforcement learning, statistical and Bayesian learning, instance-based learning, explanation-based learning and computational learning theory.
  • Part II: Machine learning applications: Selected from data mining, bioinformatics, biomedical modelling, medical diagnosis, text classification, pattern recognition and/or other contemporary applications.

    Expected Work:

    Reading; assignments (some involve programming and/or running existing programs); a midterm; a final; and a term project with a paper. (Unless otherwise announced by the instructor, all assignments and all exams must be done entirely on your own.)

    Academic Honesty and Integrity:

    All academic work must meet the standards contained in "A Culture of Honesty." Students are responsible for informing themselves about those standards before performing any academic work. The penalties for academic dishonesty are severe and ignorance is not an acceptable defense.

    Grading Policy:

  • Assignments: 30% (programs, homework, attendance, paper presentation)
  • Midterm Examination: 20%
  • Final Examination: 25%
  • Term Project: 25% (includes term paper and presentation)
    Students may work on their term projects in groups of up to three students each. The above distribution is tentative and may change; the instructor will announce any changes.

    Assignment Submission Policy

    Assignments must be turned in by the assigned deadline. Late assignments will not be accepted. The instructor may make rare exceptions, only under extenuating circumstances and in accordance with university policies.

    Course Home-page

    A variety of materials will be made available on the ML Class Home-page at http://www.cs.uga.edu/~khaled/MLcourse/, including handouts, lecture notes and assignments. Announcements may be posted between class meetings. You are responsible for being aware of whatever information is posted there.

    Lecture Notes

    Copies of some of Dr. Rasheed's lecture notes will be available at the bottom of the class home page. Not all lectures will have electronic notes, however, so students should be prepared to take notes during any lecture.

    Textbook in Bookstore

  • "Machine Learning", Tom Mitchell. McGraw-Hill, 1997. (Required.)

    Additional Books

  • "Data Mining: Practical Machine Learning Tools and Techniques (2nd edition)", Ian Witten & Eibe Frank. Morgan Kaufmann, 2005.
  • "Evolutionary Computation: Toward a New Philosophy of Machine Intelligence", David Fogel. IEEE Press, 1999.

    Web Resources

  • David Aha's Machine Learning Resources
  • University of California at Irvine ML Repository
  • The WEKA Machine Learning Project

    Announcements:

  • [5-6-2009] The final exam will be tomorrow, Thursday, May 7th, from 3:30 pm to 6:30 pm in the same room where the class met during the semester. The exam will be open book and open notes and will cover all the material covered in the course, including all handouts. You should bring your lecture notes, handouts and any books or notes you anticipate using in the exam. Cell phones, laptops and any other computers or communication devices will not be allowed in the exam. One paper from among those presented by students in class will be selected at random to become the subject of one question in the final. Copies of that paper will be made for all of you, so you need not bring copies of any of those papers to the exam.
  • [5-6-2009] The course project reports are due at the final exam. Please write the project report as a conference paper of about 8 two-column pages or 12 single-column pages (there is no restriction on size though). You should include an introduction, a mention of related work if any, a description of your experiments and results, and a conclusion. In the introduction or elsewhere in the paper you should describe the domain that you applied your ML technique(s) to, in enough detail for the reader to appreciate the significance and difficulty of the problem. Please bring a hard copy to the exam and include your email addresses as well as the URLs of any demo/supporting web pages. There is a slight chance that I might contact you soon after the submission deadline (within 48 hours) requesting code, clarifications or more data. It will therefore be helpful to include the email addresses of all the group members in case one or more of them leave town immediately after the submission.
  • [4-29-2009] All your scores so far are posted HERE. Please make sure that your scores are properly recorded. Note that the projected grades are old and do not reflect the work that was done after the midterm.
  • [1-29-2009] My former student Cesar Koirala has kindly prepared a PowerPoint presentation about Weka that you can find HERE. It gives step-by-step instructions on how to use the package for your homework assignments and provides a complete tutorial.

    Papers

  • "Least significant bit steganography detection with machine learning techniques" Shen Ge et al., 2007. [Chandana Kaza][3-18] {download}
  • "User-Oriented Document Summarization through Vision-Based Eye-Tracking" Songhua Xu et al., 2009. [ManChon U][3-18] {download}
  • "Learning to compress images and videos" Li Cheng and S. Vishwanathan, 2007. [Sal LaMarca][3-18] {download}
  • "Simulation of mass spectra of noncyclic alkanes and alkenes using artificial neural network" M. Jalili-Heravi and M. Fatemi, 2000. [Jun Han][3-25] {download}
  • "Distinguishing Protein-Coding from Non-Coding RNAs through Support Vector Machines" Jinfeng Liu et al., 2006. [Shasha Liu][3-25] {download}
  • "Analysis of human motion using snakes and neural networks" Ken Tabb et al., 2000. [Qian Ma][3-25] {download}
  • "Browsing on Small Screens: Recasting Web-Page Segmentation into an Efficient Machine Learning Framework" Shumeet Baluja, 2006. [Qi Li][4-1] {download}
  • "A comparative study on diabetes disease diagnosis using neural networks" Hasan Temurtas et al., 2009. [Sascha Strauss][4-1] {download}
  • "Hybrid genetic algorithms and support vector machines for bankruptcy prediction" Sung-Hwan Min, 2005. [Rahul Lakshmanan][4-1] {download}
  • "Clustering by Passing Messages Between Data Points" Brendan J. Frey and Delbert Dueck, 2007. [Dajiang Zhu][4-8] {download}
  • "Identifying Predictive Structures in Relational Data Using Multiple Instance Learning" Amy McGovern and David Jensen, 2003. [Harini][4-8] {download}
  • "Discrete combinatorial circuits emerging in neural networks: A mechanism for rules of grammar in the human brain?" Friedemann Pulvermüller and Andreas Knoblauch, 2009. [Matthew Eavenson][4-8] {download}
  • "Online Feature Selection for Pixel Classification" Karen Glocer et al., 2005. [Vasim Mahamuda][4-15] {download}
  • "Support Cluster Machine" Bin Li et al., 2007. [Ekhlas Sonu][4-15] {download}
  • "Scaling Up Context-Sensitive Text Correction" Andrew J. Carlson et al., 2001. [Shahab Razavi][4-15] {download}
  • "Overcoming the Brittleness Bottleneck using Wikipedia: Enhancing Text Categorization with Encyclopedic Knowledge" Evgeniy Gabrilovich and Shaul Markovitch, 2006. [Sheng Yin][4-21] {download}
  • "Focused Crawling Using Context Graphs" M. Diligenti et al., 2000. [Ankur Oberai][4-21] {download}
  • "Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks" Javed Khan et al., 2001. [Liang Wang][4-21] {download}
  • "Improving Text Classification Using EM with Background Text" S. Zelikovitz and H. Hirsh, 2005. [Thomas Drapela][4-21] {download}
  • "Learning to Detect Objects in Images via a Sparse, Part-Based Representation" S. Agarwal et al., 2004. [Matthew Losanno][4-21] {download}
  • "An Empirical Research on Extracting Relations from Wikipedia Text" 2008. [Justin Martin][4-23] {download}
  • "Evolutionary Function Approximation for Reinforcement Learning" Shimon Whiteson and Peter Stone, 2006. [Zhen Li][4-23] {download}
  • "An Introduction to Variable and Feature Selection" Isabelle Guyon and André Elisseeff, 2003. [Naveed Ahmed][4-23] {download}
  • "A comparative study of machine-learning methods to predict the effects of single nucleotide polymorphisms on protein function" Krishnan et al., 2003. [?][4-21] {download}

    Assignments:

  • Assignment 1
  • Assignment 2
  • Assignment 3
  • Assignment 4
  • Assignment 5
  • Assignment 6
  • Assignment 7

    Lecture Notes:

  • Introduction
  • Chapter 1
  • Chapter 2
  • Chapter 3
  • Chapter 4
  • Chapter 5
  • Chapter 6
  • Chapter 8
  • Evolutionary Computation
  • Chapter 7
  • Chapter 13
    The course syllabus is a general plan for the course; deviations announced to the class by the instructor may be necessary.

    Last modified: May 6, 2009.

    Khaled Rasheed (khaled[at]cs.uga.edu)