Sunday, 24 August 2008

Category matching with PAD file categories

When I made the improvements to softtester earlier this year, I decided to create a categories table with an ID.

After all, I only have a set number of categories, so it makes sense to store an ID instead of the category text in my pads table.

I have a pattern-matching function which takes my categories and tries to match them with the PAD file categories for each submission. I try removing spaces, odd symbols, etc. However, it wasn't until this week that I noticed how many sites use completely different categories from those shown in the PAD spec.
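To make the idea concrete, here's a rough sketch of that kind of normalise-then-compare matching. It isn't my actual function: the category names and IDs are made-up samples, the names `normalise` and `match_category` are just for illustration, and returning -1 for "no match" is an assumption.

```python
import re

# Hypothetical sample of my categories and their IDs; the real values
# would come from the categories table.
MY_CATEGORIES = {
    "Audio & Multimedia": 1,
    "Business": 2,
    "Games & Entertainment": 3,
}


def normalise(name: str) -> str:
    """Lower-case the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


# Normalise my own categories once, so each submission is a single lookup.
NORMALISED = {normalise(name): cat_id for name, cat_id in MY_CATEGORIES.items()}


def match_category(pad_category: str) -> int:
    """Return the matching category ID, or -1 (assumed marker) if none."""
    return NORMALISED.get(normalise(pad_category), -1)
```

So `match_category("audio&multimedia")` and `match_category("Audio & Multimedia")` both land on the same ID, while a category that isn't in the list at all comes back as -1.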

I guess, strictly speaking, I should just reject PAD files whose categories aren't in the spec.
However, up till now I'd assumed that the submissions were correct and my function was flawed. I'm still not 100% convinced either way.

Either way, I'd like to build a complete list of the categories that turn up in submissions, which I can then match against my own categories.
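One shape such a list could take is an alias table, where every spelling seen in a submission points at one of my category IDs. The sketch below is only an assumption about how that might look: the aliases, IDs and function names are invented for illustration, and -1 is again assumed to mean "unmatched".

```python
from collections import Counter

# Hypothetical aliases: each key is a spelling seen in submissions, each
# value an ID from my categories table. None of this is real data.
ALIASES = {
    "multimedia": 1,
    "audio": 1,
    "mp3 tools": 1,
    "office": 2,
    "games": 3,
}

UNMATCHED = -1  # assumed marker for "no category found"


def clean(pad_category: str) -> str:
    """Lower-case the value and collapse runs of whitespace."""
    return " ".join(pad_category.lower().split())


def lookup_alias(pad_category: str) -> int:
    """Return the category ID for a known alias, otherwise UNMATCHED."""
    return ALIASES.get(clean(pad_category), UNMATCHED)


def unmatched_counts(submitted):
    """Count spellings that have no alias yet, most frequent first."""
    misses = Counter(
        clean(value) for value in submitted if lookup_alias(value) == UNMATCHED
    )
    return misses.most_common()
```

Running `unmatched_counts` over a batch of submissions would show which unknown spellings come up most often, so the alias list can be filled in from the top down.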

Thoughts?

by JM

EDIT: I did some work on my function yesterday and I think it's now much improved. I had 28 failures overnight from Friday to Saturday, and only 3 last night / this morning.

2 comments:

  1. The PHP for Softwarelode only adds PADs if the categories match exactly; if they don't, the PAD gets rejected.

    I'm not sure if this is a good or bad thing really. However, I think your method of defining aliases that can be matched to your categories is a good one if you want to allow inexact matching.

  2. Well, I think losing listings is a bad thing, provided they aren't duplicates.

    I've just added some code which takes the categories value from the PAD file and splits it into an array. It then looks at the PAD spec specific value, does an instr, and matches on that.

    That's at the end of the function, though. Kind of a last resort.

    I think maybe if all this works out and I only get a few failures, I might start rejecting the PAD file.

    I'll also have to go through the -1 values in my database and try to populate those. Then I'll do a count on the ones that are still -1 and maybe delete them. I have 2,500 -1s at the moment, which isn't a huge amount.

    That reminds me, I must add my spidering functions back in, now that things have settled down a bit. Although I think that's a topic in itself, e.g. the order in which you do it.
