HSC Classwork

Data warehouses and Data mining

posted Nov 28, 2010, 2:56 PM by Unknown user   [ updated Nov 28, 2010, 9:45 PM ]

  1. Describe what each is:
    • A data warehouse is a database that collects related data from different sources. It stores raw data that can be analysed, and may be linked.
    • Data mining is a process that looks for relationships and patterns in the data in a database or a data warehouse. It sorts through data to come up with connections between the data.
  2. Outline the benefits that can be gained by organisations that keep these:
    • If a supermarket keeps a data warehouse of the prices of different foods and their trends over a long period of time, it can base decisions, such as what to stock or when to change a price, on those trends.
    • If a fast food outlet uses data mining, it may be able to find trends such as what people in a specific suburb like to buy. For example, if people in Carlingford like Big Macs, it could raise the price in that area.
  3. Explain with examples how each could be potentially abused:
    • Data warehousing could pull in information that may be considered private, such as personal information, depending on what data is pulled and where it is pulled from. If someone created a data warehouse specifically to collect private details, it could quite easily be used for identity theft and fraud.
    • Data mining could be used to link separate pieces of information. For example, one entry might hold a first name, last name and address, while another holds a first name, last name and credit card number. Through data mining, these two entries could be linked so that someone could obtain all of these details and use them. Again, this raises the issue of privacy.
    • These two examples raise the issue of data ownership, as the boundaries become slightly blurred.
  4. Using the internet, identify two companies that do each and justify the use of such methods for each company
    • Data warehousing:
      • Wikileaks
        • Wikileaks would need to keep a data warehouse because of the huge amount of data that it pulls from other sources, and because much of this data has to be analysed by Wikileaks itself and by external parties. For example, when data is leaked from a source (such as the Pentagon), the data (some 250,000 files) would often need to be searched, and possibly sorted, by anyone trying to view the leaked files.
      • Google
        • Google would have to keep a large data warehouse, not only of sites for its search results but also of the details of anyone who has a Google Account or an account that is linked to Google. This lets someone who already has a Google Account use other services (such as Blogger), lets Google tell who is doing what, and gives the other service some of the details of the person who owns the account.
    • Data mining:
      • Amazon
        • Amazon may want to do data mining to look for patterns in what its customers buy, so that it can suggest items that are often bought together.
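
As a rough illustration of the "suggested items" idea, the sketch below counts which items turn up in the same order and uses those counts to suggest related items. The order data and the function names (`pair_counts`, `suggest`) are made up for this example; it is not how Amazon actually does it, only one simple way of finding a pattern in purchase data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order data: each inner list is one customer's order.
orders = [
    ["book", "bookmark", "lamp"],
    ["book", "bookmark"],
    ["lamp", "desk"],
    ["book", "lamp"],
]

def pair_counts(orders):
    """Count how often each pair of items appears in the same order."""
    counts = Counter()
    for basket in orders:
        # sorted() makes ("book", "lamp") and ("lamp", "book") the same key.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

def suggest(item, orders, top=2):
    """Return the items most often bought together with `item`."""
    related = Counter()
    for (a, b), n in pair_counts(orders).items():
        if a == item:
            related[b] += n
        elif b == item:
            related[a] += n
    return [name for name, _ in related.most_common(top)]
```

Calling `suggest("desk", orders)` returns the items that shared an order with a desk. A real system would work on far more data, but the pattern-finding step is the same in spirit.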

Databases (Afternoon)

posted Nov 3, 2010, 8:45 PM by Unknown user

  1. The type of database would be a relational database
  2. The entities represented are students, projects, assessments, marks
  3. The analysis occurring in the database is searching, sorting and what-if analysis
  4. Records and fields in the database are the assessments and each student's performance in these
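
The three kinds of analysis named above can be sketched with Python's built-in `sqlite3` module. The table names, column names, weights and marks below are all assumptions made up for the example (the original database is not shown here), and the projects entity is left out to keep it short.

```python
import sqlite3

# In-memory database with hypothetical tables and sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE assessments (
        assessment_id INTEGER PRIMARY KEY, title TEXT, weight REAL
    );
    CREATE TABLE marks (
        student_id INTEGER REFERENCES students(student_id),
        assessment_id INTEGER REFERENCES assessments(assessment_id),
        mark REAL
    );
    INSERT INTO students VALUES (1, 'Alex'), (2, 'Sam');
    INSERT INTO assessments VALUES (1, 'Project', 0.4), (2, 'Exam', 0.6);
    INSERT INTO marks VALUES (1, 1, 80), (1, 2, 70), (2, 1, 60), (2, 2, 88);
""")

# Searching: find every mark for one student.
alex_marks = conn.execute(
    "SELECT a.title, m.mark FROM marks m "
    "JOIN assessments a ON a.assessment_id = m.assessment_id "
    "WHERE m.student_id = ? ORDER BY a.assessment_id",
    (1,),
).fetchall()

# Sorting: rank students by weighted total, highest first.
ranking = conn.execute(
    "SELECT s.name, SUM(m.mark * a.weight) AS total FROM marks m "
    "JOIN students s ON s.student_id = m.student_id "
    "JOIN assessments a ON a.assessment_id = m.assessment_id "
    "GROUP BY s.student_id ORDER BY total DESC"
).fetchall()

# What-if analysis: recompute the totals as if both assessments were
# weighted 50/50, without changing any stored data.
what_if = conn.execute(
    "SELECT s.name, SUM(m.mark * 0.5) AS total FROM marks m "
    "JOIN students s ON s.student_id = m.student_id "
    "GROUP BY s.student_id ORDER BY total DESC"
).fetchall()
```

With this sample data the what-if query changes who comes first, which is the point of that kind of analysis: it shows the effect of a change before the change is made.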

Databases

posted Nov 3, 2010, 2:11 PM by Unknown user   [ updated Nov 3, 2010, 2:50 PM ]

data - base (n)
 
base - anything you can stand upon
 
database - a place where data can stand
 
The data in a database has a particular structure
 
Flat File
  • Is a simple file with rows of textual data
  • Usually the rows describe several of the same kind of thing
  • A phonebook is an example of a flat file database
  • Smaller in file size
  • Versatile (no strict rules)
  • Spreadsheets
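
A minimal sketch of a flat file, using the phonebook example from the list above. The names, suburbs and phone numbers are invented, and CSV is just one common way of storing rows of text; the helper names `load_rows` and `find_by_suburb` are made up for this sketch.

```python
import csv
import io

# Hypothetical phonebook stored as plain rows of text (CSV).
PHONEBOOK_TEXT = """name,suburb,phone
Lee,Carlingford,5550 0101
Nguyen,Epping,5550 0102
Smith,Carlingford,5550 0103
"""

def load_rows(text):
    """Read every row of the flat file into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def find_by_suburb(rows, suburb):
    """Check the rows one by one; a flat file has no links to other files."""
    return [row["name"] for row in rows if row["suburb"] == suburb]

rows = load_rows(PHONEBOOK_TEXT)
```

Every row has the same fields, and nothing stops a row from holding odd values, which matches the "no strict rules" point above.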
Relational
  • It is several flat file databases
  • Each part is related
  • It handles complexity well
  • Can be customised for specific purposes
  • Usually have security built in
  • Microsoft Access
  • 3 layers
    • Client folders
    • Stock
    • Invoices
      • Inside the invoices section there are other flat file databases
    • Each is a flat file database
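
One way to picture "several flat files, each part related" is sketched below with the clients, stock and invoices layers from the list above. All of the field names and values are assumptions for illustration; the key idea is that an invoice stores only the IDs of a client and a stock item, and the details are looked up from the other tables.

```python
# Three hypothetical flat tables, each a list of records.
clients = [
    {"client_id": 1, "name": "A. Customer"},
    {"client_id": 2, "name": "B. Customer"},
]
stock = [
    {"item_id": 10, "description": "Widget", "price": 2.50},
    {"item_id": 11, "description": "Gadget", "price": 4.00},
]
invoices = [
    {"invoice_id": 100, "client_id": 1, "item_id": 10, "quantity": 4},
    {"invoice_id": 101, "client_id": 2, "item_id": 11, "quantity": 1},
    {"invoice_id": 102, "client_id": 1, "item_id": 11, "quantity": 2},
]

def index_by(rows, key):
    """Build a lookup from a key field (e.g. client_id) to its record."""
    return {row[key]: row for row in rows}

def invoice_lines(invoices, clients, stock):
    """Follow the IDs in each invoice to the related client and stock item."""
    clients_by_id = index_by(clients, "client_id")
    stock_by_id = index_by(stock, "item_id")
    lines = []
    for inv in invoices:
        client = clients_by_id[inv["client_id"]]
        item = stock_by_id[inv["item_id"]]
        lines.append({
            "invoice_id": inv["invoice_id"],
            "client": client["name"],
            "item": item["description"],
            "total": item["price"] * inv["quantity"],
        })
    return lines
```

Because a client's name is stored once in `clients`, changing it there updates every invoice line that refers to that client. Software such as Microsoft Access does this linking for you.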
Timetable
  • Subject
  • Classes
  • Houses
  • Teachers (days at school)
  • Grade
  • Roll class
  • Periods
  • Playground duty
  • Subject choice
Databases
  • Entities - the real world objects that the database describes (students, rooms, teachers, classes)
  • Fields - columns, the things that we know about the entity (student number, roll class, grade)
  • Records - rows, each row contains pieces of data that are represented in the fields
  • Data dictionary - a comprehensive description of the content and type of every piece of data in the database
    • Name
      • What type of data is it (e.g. text)
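
A data dictionary can be sketched as a small table of field names and types. The example below uses the student fields mentioned above; the descriptions, the chosen types and the `check_record` helper are assumptions added for illustration only.

```python
# Hypothetical data dictionary: one entry per field in a student record.
data_dictionary = {
    "student_number": {"type": int, "description": "Number identifying the student"},
    "roll_class": {"type": str, "description": "Roll class the student belongs to"},
    "grade": {"type": int, "description": "School year the student is in"},
}

def check_record(record, dictionary):
    """List every way a record fails to match the data dictionary."""
    problems = []
    for field, spec in dictionary.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], spec["type"]):
            problems.append(f"{field}: expected {spec['type'].__name__}")
    return problems
```

A record such as `{"student_number": 1234, "roll_class": "12A", "grade": 12}` passes with no problems, while a record with a text student number or a missing grade is reported.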

19/10/10

posted Oct 18, 2010, 4:01 PM by Unknown user

Group Project evaluation:
 
Good points:
  • Overall professional look (surprisingly, this makes up almost all of our strong points)
 
Agreement with:
  • Navigation to major parts of the site: we could have made it slightly easier to access everything else (although a sidebar may not have been the best solution)
  • The company blog is also slightly brief; again, we could have put more effort into it
Disagreement with:
  • Lack of academic achievements (Cheuk had listed a multitude of business qualifications and experience)
  • Improve website design (I'm sorry, you're being way too vague here)

Personality test

posted Oct 17, 2010, 4:04 PM by Unknown user   [ updated Oct 17, 2010, 4:06 PM ]

INTJ
 
44 38 75 56
