Amazon Machine Learning – Take Two

Since my last post analyzing Amazon Machine Learning (AML), I have received quite a bit of feedback from friends and colleagues about my findings. Some of it criticized the setup of my test, especially the formula I used. As I consider this discussion valuable, I would like to re-run my analysis in this post with a slightly different approach to see whether that changes anything significantly.

Summary of Feedback

The feedback I have received can be summarized in the following two statements:

  • For the set of values with a=1 and a=2, only seven distinct values in total are provided to the model (there are in fact more, but they merely repeat what the model already knows). In other words, the model has very limited knowledge in that region. Yet my approach explicitly queries that region when evaluating the quality of predictions.
  • Modulo computation with a prime as the modulus creates a finite field. Performing calculations in such a field can become a very complex task, because higher numbers wrap around to lower ones. Additionally, it does not represent typical decision behavior, where additional interest usually also results in additional willingness to buy a product.
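To make the second point a bit more tangible, here is a minimal sketch of the wrap-around effect. The modulus 7 is only an assumption chosen for illustration (it is not necessarily the prime from my original post), and the helper names `wrapped` and `is_non_decreasing` are hypothetical.

```python
# Hypothetical modulus, chosen only for illustration.
P = 7


def wrapped(x: int, p: int = P) -> int:
    """Reduce x modulo p, so values at or above p wrap back to the low end."""
    return x % p


def is_non_decreasing(seq) -> bool:
    """True if every element is at least as large as its predecessor."""
    return all(a <= b for a, b in zip(seq, seq[1:]))


values = list(range(15))                      # steadily growing "interest"
wrapped_values = [wrapped(v) for v in values]

print(values)
print(wrapped_values)
print(is_non_decreasing(values), is_non_decreasing(wrapped_values))
```

Without the modulus, a larger input always yields a larger value. With it, the value drops back to 0 every time a multiple of the modulus is reached, which is the non-monotonic behavior the feedback refers to.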

New Setup

I value this kind of feedback, and in fact it reveals certain properties that are not desirable in such an analysis. That is why I would like to repeat the analysis from my original post with the following modified setup:

  • We again have a, b, and c as parameters, each an integer in the range 1 to 7.
  • Again, we will use the formula a * b - c to compute an intermediate value. Note, however, that this time there is no modulus calculation involved.
  • To map our intermediate result to a binary decision, the condition intermediate_result > 10 is evaluated.
  • With a, b, and c each ranging from 1 to 7, we have a space of 7 * 7 * 7 = 343 tuples. Of these, 156 tuples are considered truthy. The rest (187) are falsy.
  • For later verification, we leave the following values out of the training data:
    • a = 6, b = 6, c = 6 (intermediate result = 30)
    • a = 7, b = 2, c = 3 (intermediate result = 11)
    • a = 7, b = 2, c = 4 (intermediate result = 10)
    • a = 7, b = 2, c = 5 (intermediate result = 9)
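The setup above can be sketched in a few lines of Python. This is only an illustration of the data generation, not the code I used to feed AML. The names `intermediate`, `label`, `HOLDOUT`, and `training` are hypothetical.

```python
from itertools import product

VALUES = range(1, 8)   # the integers 1..7 for a, b, and c
THRESHOLD = 10         # a tuple is truthy if a * b - c > 10

# Tuples left out of the training data for later verification.
HOLDOUT = {(6, 6, 6), (7, 2, 3), (7, 2, 4), (7, 2, 5)}


def intermediate(a: int, b: int, c: int) -> int:
    """Intermediate value; no modulus involved this time."""
    return a * b - c


def label(a: int, b: int, c: int) -> bool:
    """Binary decision derived from the intermediate value."""
    return intermediate(a, b, c) > THRESHOLD


all_tuples = list(product(VALUES, repeat=3))
training = [t for t in all_tuples if t not in HOLDOUT]
truthy = sum(label(*t) for t in all_tuples)
falsy = len(all_tuples) - truthy

print(len(all_tuples), truthy, falsy)   # 343 156 187
for t in sorted(HOLDOUT):
    print(t, intermediate(*t), label(*t))
```

Two of the held-out tuples are truthy (intermediate results 30 and 11) and two are falsy (10 and 9). Three of the four sit directly around the decision boundary.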

(continued on next page)
