Old 01-21-2006   #1
ibold
 
Join Date: Dec 2005
Posts: 31
Default Working DMOZ import Mod

I have in hand a working DMOZ import mod, fixed from the script provided in this thread:
{link to old forum removed}

Please PM me if you are interested in getting a copy of it.

If it's helpful to you: it did cost me a bit of money to have it fixed via Scriptlance, and a donation would be appreciated (don't go overboard, just whatever you can spare).
My paypal is:
admin at ibold.net

Once I get a mod's help, I can post the attachment in this post. Enjoy.

EDIT:
I uploaded it to my own server:
{unfortunately this link was broken and is now removed}
__________________
Directory of Directories << Submit your directory so that other people can find you and add their site!

Last edited by David; 03-23-2008 at 10:15 PM.
Old 01-21-2006   #2
dagger
Supporter
 
Join Date: Dec 2005
Posts: 22
Default

hi ibold

Do you mean you want some $ to give it to me, or is it free? Because I am looking for this mod :(
__________________
Please double check your sig link
Old 01-21-2006   #3
djhomeless
Supporter
 
Join Date: Jan 2006
Location: London
Posts: 75
Default

I am more than happy to throw down some green for a working mod. I've sent a PM, if you can send me a copy of the working mod I would really appreciate it.

I set up an auction on Scriptlance to fix this as well, so I would hate to double up resources on the same fix!

Geoffrey
Old 01-21-2006   #4
ibold
 
Join Date: Dec 2005
Posts: 31
Default

All PMs sent. I gave David a copy of the script if he would like to include a link here for it. In the meantime, I uploaded it to my site:

{unfortunately this link was broken and is now removed}

Enjoy, and let me know if you run into any problems or have questions.
Last edited by David; 03-23-2008 at 09:52 PM.
Old 01-21-2006   #5
djhomeless
Supporter
 
Join Date: Jan 2006
Location: London
Posts: 75
Default

**censored****censored****censored****censored**. Really got my hopes up.

Same result as before:

Quote:
Everything should be finished now. There may be a few strange bugs that I don't know about yet. If you find some then post them here.

You have a total of 0 categories and 1 links.

If you have enjoyed this little script then please consider giving a donation.
Now every time I retry crawl.php, it finishes straight away.

Any ideas??

Geoffrey
Old 01-21-2006   #6
djhomeless
Supporter
 
Join Date: Jan 2006
Location: London
Posts: 75
Default

Hold the phone.

It can't handle root! I mean the root of Dmoz (/). I just tried a subdirectory and it seems to be working.
Old 01-22-2006   #7
David
Administrator
phpLD Administrator
Supporter
 
Join Date: Jan 2005
Posts: 11,667
Default

I'll be trying to add this version to 3.0. Understandably, importing from DMOZ, especially from root, is an enormous task. We have to start somewhere, and then we'll make improvements as we go. Sound good?
Old 01-22-2006   #8
ibold
 
Join Date: Dec 2005
Posts: 31
Default

Thanks to David for making a donation to help support this
Old 01-22-2006   #9
Neticus
Supporter
 
Join Date: Dec 2005
Posts: 487
Default

Quote:
Originally Posted by djhomeless
Now every time I retry crawl.php, it finishes straight away.

Any ideas??

Geoffrey
Sounds similar to an issue I had with the old mod. I got round it by deleting the previously crawled URL from the Dmoz_TABLE, as well as emptying the dmoz Category and Links tables in phpMyAdmin, before attempting another crawl. This applies regardless of whether convert.php had worked or not.
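
The reset described above can be sketched roughly as follows. This is only an illustration: it uses Python with an in-memory SQLite database standing in for MySQL, and the table and column names (`dmoz_crawled`, `dmoz_category`, `dmoz_links`, `url`) are hypothetical, not the names the mod actually creates.

```python
import sqlite3

# Hypothetical table names: check your own install for the real ones.
CRAWLED_TABLE = "dmoz_crawled"
STAGING_TABLES = ("dmoz_category", "dmoz_links")


def reset_crawl(conn, start_url):
    """Forget one previously crawled start URL and empty the staging tables."""
    conn.execute(f"DELETE FROM {CRAWLED_TABLE} WHERE url = ?", (start_url,))
    for table in STAGING_TABLES:
        conn.execute(f"DELETE FROM {table}")
    conn.commit()


# Demo against an in-memory SQLite database standing in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dmoz_crawled (url TEXT);
    CREATE TABLE dmoz_category (name TEXT);
    CREATE TABLE dmoz_links (url TEXT);
    INSERT INTO dmoz_crawled VALUES ('http://dmoz.org/Games/');
    INSERT INTO dmoz_crawled VALUES ('http://dmoz.org/Health/');
    INSERT INTO dmoz_category VALUES ('Games');
    INSERT INTO dmoz_links VALUES ('http://example.com/');
    """
)
reset_crawl(conn, "http://dmoz.org/Games/")
remaining = [row[0] for row in conn.execute("SELECT url FROM dmoz_crawled")]
print(remaining)
```

Only the start URL being retried is removed from the crawl record; other recorded URLs are left alone, while both staging tables are emptied completely.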

- Neticus
Old 01-22-2006   #10
resource
 
Join Date: Jan 2006
Posts: 8
Default

Take a look at this
http://rdf.dmoz.org/rdf/content.rdf.u8.gz
You just need to extract it; it's better than crawling all the pages. I don't have time to write the script for it, but you can just google "Parsing RDF".
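
For anyone who wants a starting point, here is a minimal sketch of streaming through a gzipped dump and keeping only the links under one category. It is written in Python and runs on a tiny made-up sample; the element names (`ExternalPage`, `topic`, `Title`) and the `about` attribute are my assumption of how the dump is laid out, so check them against the real file before relying on this.

```python
import gzip
import io
import xml.etree.ElementTree as ET

# Tiny made-up sample; the structure is an assumption about the real dump.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<RDF xmlns:r="http://www.w3.org/TR/RDF/"
     xmlns:d="http://purl.org/dc/elements/1.0/"
     xmlns="http://dmoz.org/rdf/">
  <ExternalPage about="http://example.com/chess">
    <d:Title>Chess Club</d:Title>
    <topic>Top/Games/Board_Games</topic>
  </ExternalPage>
  <ExternalPage about="http://example.org/clinic">
    <d:Title>Clinic</d:Title>
    <topic>Top/Health</topic>
  </ExternalPage>
</RDF>
"""


def local(tag):
    """Strip an XML namespace, turning '{uri}Title' into 'Title'."""
    return tag.rsplit("}", 1)[-1]


def iter_links(stream, prefix="Top/"):
    """Yield (topic, url, title) for each ExternalPage whose topic starts with prefix."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "ExternalPage":
            continue
        fields = {local(child.tag): (child.text or "") for child in elem}
        topic = fields.get("topic", "")
        if topic.startswith(prefix):
            yield topic, elem.get("about"), fields.get("Title", "")
        elem.clear()  # drop the processed record's children to limit memory use


# Compress the sample in memory so the demo reads gzip like the real file would.
compressed = io.BytesIO(gzip.compress(SAMPLE))
games = list(iter_links(gzip.open(compressed), prefix="Top/Games"))
print(games)
```

With the real file you would pass `gzip.open("content.rdf.u8.gz")` instead of the in-memory sample, and pick the `prefix` for the branch you want rather than taking the whole tree.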

Cheers
Old 01-22-2006   #11
David
Administrator
phpLD Administrator
Supporter
 
Join Date: Jan 2005
Posts: 11,667
Default

I could probably get someone to write a Windows-based script that creates all the SQL code for inserting into the tables. You would then need to import the BIG file using something like BigDump.
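
As a rough idea of what such a generator could look like, here is a small Python sketch that turns rows into batched INSERT statements. The table name comes from this thread, but the column names (`TITLE`, `URL`) and the batch size are assumptions for illustration, and the quoting helper is a simplification rather than a complete SQL escaper.

```python
def sql_quote(value):
    """Quote a string as a MySQL-style literal (simplified, not a full escaper)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def insert_statements(table, columns, rows, batch_size=100):
    """Yield INSERT statements with an explicit column list, batch_size rows each."""
    col_list = ", ".join(columns)
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        values = ",\n".join(
            "(" + ", ".join(sql_quote(str(v)) for v in row) + ")" for row in batch
        )
        yield f"INSERT INTO {table} ({col_list}) VALUES\n{values};"


# Hypothetical rows and column names, for illustration only.
rows = [
    ("Chess Club", "http://example.com/"),
    ("O'Neil's Games", "http://example.org/"),
]
statements = list(insert_statements("PLD_LINK", ["TITLE", "URL"], rows, batch_size=1))
print("\n".join(statements))
```

Writing the statements out in modest batches keeps each query small; some import tools may prefer one row per statement, in which case `batch_size=1` as in the demo does that.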
Old 01-22-2006   #12
djhomeless
Supporter
 
Join Date: Jan 2006
Location: London
Posts: 75
Default

Well, I'm back at square one!

I tried to import just the /Games category. 200k links, 4k categories, and 24 hours later it was done.

Sadly, the hamsters in my pokey server couldn't run fast enough once the links were all added. Granted, it's a LOT of links, but whenever a search was run, MySQL and/or Apache would use up to 90% of the CPU.

Without branching this thread, is there a theoretical or logical limit to the number of cats/links that phplinks can handle? I ask because afterwards the add/edit categories link in the backend kept timing out.

If this were a perfect world, zootreeves's script would actually work, and you would be given a chance to cherry-pick cats and subcats instead of having to take the entire tree. zootreeves's DMOZ script works off the assumption that you have the RDF file downloaded locally, which should be faster than doing the scrape.

my 2 cents
Old 01-24-2006   #13
f1gm3nt
 
Join Date: Aug 2005
Location: Chattanooga, TN
Posts: 76
Default

I think that you are trying to sell my work and that kind of pisses me off. I'm very busy and don't sleep much because I am working on a lot of different projects. If you made some improvements great let me know and I'll update the script. However, it's bull **censored****censored****censored****censored** that you are doing this and asking for donations. I've worked hard on this thing and for some it works and others it doesn't. If I knew an ass like you would rip off my work then I would have never posted this in here in the first place.
Old 01-24-2006   #14
ibold
 
Join Date: Dec 2005
Posts: 31
Default

Quote:
Originally Posted by f1gm3nt
I think that you are trying to sell my work and that kind of pisses me off. I'm very busy and don't sleep much because I am working on a lot of different projects. If you made some improvements great let me know and I'll update the script. However, it's bull **censored****censored****censored****censored** that you are doing this and asking for donations. I've worked hard on this thing and for some it works and others it doesn't. If I knew an ass like you would rip off my work then I would have never posted this in here in the first place.
Please read the entire post. I paid someone else to do this. As in real money. I'm only asking for a little bit of that back. I posted it for free download up above. I didn't charge anyone for it. This is the idea of an 'open source' product. I don't think that asking for donations from appreciative people goes against the spirit of this. You posting distasteful remarks, however, makes me not want to share, just as suddenly you don't want to. It doesn't help anything. I at least hope you feel better.
How is this any different from you asking for donations? Is your time or money worth more than mine? You invested something in this and so did I. Just because I started with your work doesn't mean I ripped anything off. I distributed it exactly the same way you did with yours.
How often do you take advantage of other people's work and ask for donations because you have invested time or money in it? Made any donations or even given credit to Andi Gutmans? To Zeev Suraski? How about Rasmus Lerdorf? Maybe if they knew that a person like you would rip off their work they never would have shared it in the first place.
Old 01-24-2006   #15
djhomeless
Supporter
 
Join Date: Jan 2006
Location: London
Posts: 75
Default

Quote:
Originally Posted by f1gm3nt
I think that you are trying to sell my work and that kind of pisses me off.
You should be thanking users like ibold. He went out and took your script and fixed it, using his own money. There were many forum threads where users like us were searching for an import script that actually worked. I appreciate the time you put into writing your script, but at the end of the day it didn't work, and some of us, myself included, were willing to put our money where our mouths were to get something that did the job.

As ibold suggests do read the thread. He was by no means looking to profit from this, just make back the money (or close to it) that he spent at Scriptlance.

You sir should first watch your language, then offer an apology to ibold. If it wasn't him, it would have been a dozen others.
Old 01-24-2006   #16
ibold
 
Join Date: Dec 2005
Posts: 31
Default

As far as an apology goes, it's not necessary. I certainly don't want to bias a member who makes valuable contributions to this script against making more. I was just trying to make a point. If you make something available for free, what I did (what we both did..) is relatively common practice. I do appreciate this mod and your contribution to this community.

On a more productive front, this fix didn't address the issue within the crawl.php file, but rather the convert.php file.
I am currently crawling most of the larger sections of the DMOZ site (by this I mean the main categories on the main page), and assuming nothing breaks over the next day or two on my computer (it happens..), I will be able to just offer SQL dumps to anyone who is interested in them. Right now I have 'Health' and 'Computers', just PM me if either of these interest you and I can get you an SQL dump.
Old 01-26-2006   #17
H2O
 
Join Date: Jan 2006
Location: Australia
Posts: 7
Default

hi,

I keep getting an SQL error when running convert.php:

SQL ERROR:1136: Column count doesn't match value count at row 1

Any idea?

Do I have to start with a totally empty database?
Old 01-26-2006   #18
H2O
 
Join Date: Jan 2006
Location: Australia
Posts: 7
Default

I tried again a number of times, emptying the tables and re-crawling, but got the same result.
Old 01-26-2006   #19
Neticus
Supporter
 
Join Date: Dec 2005
Posts: 487
Default

Quote:
Originally Posted by H2O
I keep getting an SQL error when running convert.php:

SQL ERROR:1136: Column count doesn't match value count at row 1
Before installing the dmoz mod, you may have installed another mod that added further fields (columns) to your default PLD database. Check for mods that have told you to CREATE or ALTER tables through SQL commands, especially your PLD_LINK or PLD_CONFIG tables.

The dmoz mod (the convert.php file) works on the premise that you have not changed the original table structure of the PLD database.
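
To illustrate why an extra column triggers this kind of error, here is a small Python sketch using an in-memory SQLite database in place of MySQL. The table and column names are hypothetical, and whether convert.php actually inserts without a column list is an assumption on my part; the point is only that an INSERT without a column list breaks once another mod has added a column, while an INSERT that names its columns keeps working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for a link table as originally installed...
conn.execute("CREATE TABLE link (title TEXT, url TEXT)")
# ...and a column added later by some other mod.
conn.execute("ALTER TABLE link ADD COLUMN pagerank INTEGER DEFAULT 0")


def insert_positional(conn, title, url):
    """Insert without naming columns: breaks when the column count changes."""
    conn.execute("INSERT INTO link VALUES (?, ?)", (title, url))


def insert_named(conn, title, url):
    """Insert with an explicit column list: extra columns take their defaults."""
    conn.execute("INSERT INTO link (title, url) VALUES (?, ?)", (title, url))


try:
    insert_positional(conn, "Example", "http://example.com/")
except sqlite3.OperationalError as exc:
    print("positional insert failed:", exc)

insert_named(conn, "Example", "http://example.com/")
stored = conn.execute("SELECT title, url, pagerank FROM link").fetchall()
print(stored)
```

SQLite words its complaint differently from MySQL's error 1136, but the cause in this demo is the same mismatch between the number of values supplied and the number of columns in the table.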

Also see this thread;

{link to old forum url removed}

Last edited by David; 03-23-2008 at 10:15 PM.
Old 01-26-2006   #20
H2O
 
Join Date: Jan 2006
Location: Australia
Posts: 7
Default

Hi, thanks for the reply.

That would be the problem then. So if I were to do the crawl first and then install the other mods, would that work?

Thanks