My scrapbook about anything which I learned or want to remember, Sometime about tech tips, thoughts and rambling. If you find anything useful don't forget to give thumbs-up :)

Breaking

Friday, May 12, 2017

#3 - Measuring Data Similarity or Dissimilarity


Continue from -
 'Measuring Data Similarity or Dissimilarity #1'
 'Measuring Data Similarity or Dissimilarity #2',


3. For Ordinal Attributes:

Ordinal attribute is an attribute with possible values that have a meaningful order or ranking among them but the magnitude between successive values is not known. Ordinal values are same as Categorical Values but with the Order.

Such as, For "Performance" columns Values are - Best, Better, Good, Average, Below Average, Bad

These values are Categorical values with order or rank so called Ordinal Values. Ordinal attributes can also be derived from discretization of numeric attributes by splitting the value range into finite number of ordered categories.

We assign rank to these categories to calculate the similarity or dissimilarity, i.e. - There is an attribute f having N possible state can have `1, 2, 3........f_N` ranking.


Measuring Data Similarity or Dissimilarity for Ordinal Attributes


How to Calculate Similarity or Dissimilarity: 

1, Assign the Rank `R_if`to each category of attribute f having N possible states.
2. Normalize the Rank between [0.0, 1.0] so that each attribute have equal weight.
Can be calculated as

`R_in = \frac{R_if - 1}{N - 1}`

3. Now Similarity or Dissimilarity can be calculated with any distance measuring techniques. ( 'Measuring Data Similarity or Dissimilarity #2)






Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Disclaimer

The postings on this site are my own and don't necessarily represent IBM's or other companies positions, strategies or opinions. All content provided on this blog is for informational purposes and knowledge sharing only.
The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of his information.