Would You Buy Software From A Scraper?

27 Responses

  1. David says:

    And there's no reason for a software company to advertise itself just through a fake legal blog – which means, in all likelihood, there are more of these blogs scraping blogs about tech, politics, music, pictures of ferrets, etc.

    I wonder if there's a way to find all of them?

  2. Ken says:

    Well David, I took the advertisement and tried a reverse image search on it, but didn't come up with any other fake blogs.

  3. Why, yes, I would install their software — no, actually, I have installed their software and legal Canadian Vicodin it's running perfectly Cialis well.

  4. On a more serious note, Mr. Szafarski has an email address (if it's different than the email addresses you've found).

  5. FiXato says:

    As their domain registration and hosting seems to be done through GoDaddy, you could try sending a C&D with a CC to abuse@godaddy.com
    Should they still not comply, have GoDaddy enforce their Copyright Policy: http://www.godaddy.com/agreements/ShowDoc.aspx?pageid=tradmark_copy
    Abuse can also be reported through https://supportcenter.godaddy.com/Abuse/SpamReport.aspx?ci=22420

  6. Narad says:

    I felt a gnawing pain in my brain stem upon viewing this product description (emphasis in original):

    With EasyLinkMail’s Auto-priority Queue and its Bio-Signature Recognition (Patent Pending), you will never miss another important email.

  7. AlphaCentauri says:

    What are all the links to bloomberglaw.com?

  8. Narad says:

    have GoDaddy enforce their Copyright Policy

    Heh. I commend to you the tattered archives of NANAE. GoDaddy doesn't give a flying fuck about what they host.

  9. Narad says:

    ^ More properly, "what they facilitate."

  10. NickPheas says:

    Will Charles Carreon be shortly asking you for $20,000?

  11. NR says:

    The paypal button collects to szafarsky's personal account, judging by the page source. Worth saving an html version if you're wanting to follow that up.

  12. Christopher Swing says:

    "Let's steal content from lawyer websites. What could possibly go wrong?"

    Actually, given that GoDaddy really doesn't give a shit as long as the payments clear, they at least don't have to worry about that going wrong.

  13. Itsathought says:

    This is just a guess, but what if the scraper site is some naive young Internet wannabe who thought he/she could earn easy money on the Internet?

    Teaching myself about creating websites, I did something rather foolish and learned a week later through reading that what I was doing was illegal. Of course my offense was not as elaborate, and I did respond to my new found knowledge by correcting it, so I guess naïveté can no longer be the site owner's excuse.

    The Internet is a great place to step into the mud of ignorance.

  14. CTrees says:

    You know, twice I've run secondary computers as experimental machines. No antivirus protection of any kind, used for web browsing and downloading everything which looked interesting, no matter how sketchy the source. The trick was to see how long it took for the boxes to get compromised, the shill email accounts I used to be compromised, etc.

    Point is, while I wouldn't buy software for this sort of testing, I certainly would have downloaded the free trials from scraper sites, on those test boxes.

    (for reference, one machine used Windows XP (no service packs), and the other used an older version of Linux Mint. The random yahoo accounts I used got compromised on both machines, but the Mint box never showed any other signs of difficulty. XP actually lasted longer than expected).

  15. Joe says:

    Ken, not sure who you wrote to but Edward Tse is the President of the company and Alice Wong is the Marketing Executive – and appears from the registrant info to be the person perpetrating this silliness. You already have Wong’s email – format of first initial, lastname @ company.com. Edward’s likely follows the same. They have a total of 12 employees (give or take a person or two).

    They are a Microsoft partner so it's likely most of their traffic and sales are driven from that. Can't see where scraping legal blogs would do anything for marketing unless they have found that legal firms tend to be the majority of their clients.

  16. AlphaCentauri says:

    Google has changed its algorithm to reduce the effects of this type of abuse. They may have more sites like this, but this is the only one Google hasn't figured out is a sham yet.

    As far as Godaddy, yes, they will host any kind of illegal activity and refuse to act on complaints, but copyright infringement tends to be in a separate category. It's worth contacting them on that or on child porn issues.

  17. FiXato says:

    Go through your logs and see if they consistently use the same IP(s) to crawl your content.
    If so, serve up shock/spam content for just those IPs in the hopes their system will automatically crawl and use that content and hurt their listings. ;-)
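The log check suggested above could be sketched roughly as follows. This is only an illustration, not anyone's actual tooling: it assumes the server writes Common/Combined Log Format access logs and that the scraper pulls from a path starting with /feed. The function names, the path prefix, the hit threshold, and the sample addresses (documentation-range IPs) are all hypothetical.

```python
import re
from collections import Counter

# Assumed layout: Common/Combined Log Format, for example
# 203.0.113.7 - - [10/Jul/2012:06:25:01 +0000] "GET /feed/ HTTP/1.1" 200 5120
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def count_hits_by_ip(lines, path_prefix="/feed"):
    """Count requests per client IP for paths starting with path_prefix."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").startswith(path_prefix):
            hits[match.group("ip")] += 1
    return hits


def repeat_visitors(lines, path_prefix="/feed", min_hits=3):
    """Return (ip, hits) pairs with at least min_hits requests, busiest first."""
    hits = count_hits_by_ip(lines, path_prefix)
    return [(ip, n) for ip, n in hits.most_common() if n >= min_hits]


# Hypothetical sample data; real input would be the lines of an access log.
sample = [
    '203.0.113.7 - - [10/Jul/2012:06:25:01 +0000] "GET /feed/ HTTP/1.1" 200 5120',
    '203.0.113.7 - - [10/Jul/2012:07:25:02 +0000] "GET /feed/ HTTP/1.1" 200 5120',
    '198.51.100.4 - - [10/Jul/2012:07:30:00 +0000] "GET /feed/ HTTP/1.1" 200 5120',
    '203.0.113.7 - - [10/Jul/2012:08:25:03 +0000] "GET /feed/ HTTP/1.1" 200 5120',
    '203.0.113.7 - - [10/Jul/2012:08:26:00 +0000] "GET /about HTTP/1.1" 200 900',
]

for ip, n in repeat_visitors(sample):
    print(ip, n)
```

An address that keeps showing up on the feed at regular intervals is only a candidate: ordinary feed readers and search engine crawlers poll the same way, so it is worth checking the user agent and reverse DNS before treating an IP as the scraper.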

  18. perlhaqr says:

    I'd just take my hard drives out and degauss them with a great big magnet. It'd be faster.

  19. alexa-blue says:

    The "donate" link (awesome that they have that, by the way) takes you to a paypal page for "szafarski@gmail.com" which seems to be an account shared by Jeff and Kandice Szafarski. Kandice is listed on several websites as the "brand manager" for KM sciences. I'd guess she's the Christoforo of this particular joint.

  20. Joe D says:

    The big question is: Will they scrape this article?

  21. TJIC says:

    @FiXato

    > Go through your logs and see if they consistently use the same IP(s) to crawl your content.
    > If so, serve up shock/spam content for just those IPs in the hopes their system will automatically crawl and use that content and hurt their listings.

    Exactly what I was going to say!

  22. FiXato says:

    http://blog.mocality.co.ke/2012/01/13/google-what-were-you-thinking/ describes a similar technique Mocality used to find out how Google was stealing their data.

  23. Noah Callaway says:

    I'll bet you their business model is not based around selling software, but defamation shakedowns at $20k a pop…

  24. John Eddy says:

    Forgetting everything else, as far as scummy business models go, it's a good one. They want to capture the legal user base, so they scrape legal blogs trying to garner some search engine juice so that legal firms searching will go to their site, see the ad and be interested in the product. I'd be really curious to see how many non-bot clickthru's they actually get on the ads, from an advertising/human nature vantage point.

    Then I'd probably try to take a Silkwood shower to get the ick off me.

    (I worked for a major web parking platform, supporting said platform, and I still feel unclean)

  25. Tim Farley says:

    You may not be able to get the site shut down by their host, but you can get them dropped out of the Google index. That effectively zeroes out what they are trying to accomplish.

    Google has a special form for reporting this stuff here: http://goo.gl/S2hIh

  26. Robert C says:

    I don't understand how a site like this draws any traffic. Why not just go to the source. Or if you're really lazy, just put all of those feeds into your own RSS reader. What's the value add here?

  27. John Eddy says:

    "I don't understand how a site like this draws any traffic. Why not just go to the source."

    Because if you search for X, you don't necessarily know what the true source is, you simply go to the first record the search engine returns.

    Think about the times you want to share a picture of, say, a drill, as part of a joke.

    Do you just do a google image search and grab the first image of a drill you like, or do you dig into that image and find out where it originally sourced from and link to that? For basic stuff, and heck, probably even some of the not basic stuff, I bet you, like most people including myself, just grab the first image url you can and use it.

    Or think about a news story. You search on the term and you click on whichever news story seems to be relevant, even if all five returns are AP feed copies. Believe me, the site (assuming it manages to get returned by search results) will get real, human traffic. Mostly because it gets returned. Content is king, source is tertiary.

    If someone searching on a particular term manages to land on the scraper first, that's where they will click. They won't take a full sentence to try and see if it appears somewhere else, they'll simply just go there. Heck, it doesn't even need to be the first hit (although yes, the first three returns are most likely to be clicked). First page is good enough. If the google/bing web preview just happens to be good enough, you'll get traffic.