
CGI / Perl is driving me crazy

Looking for help

1 Christian

I won't learn that at my age anymore... I can barely manage HTML ;)

Does anyone have time to install a Perl script on my server for me?


Article 19 of UN Resolution 217A3, 10.12.1948

19.05.2008 13:28

2 Jörg Kruse

Hello Christian,

installing CGI scripts can sometimes be a fairly laborious affair, since there are many potential pitfalls. If you tell us the results of your attempts so far (e.g. error messages), we can start narrowing down the possible cause of the problem.

This checklist on Selfhtml is also quite useful for troubleshooting:


19.05.2008 14:47 | edited: 19.05.2008 14:49

3 Christian

Well.... I don't think I even got as far as the installation. This is completely new to me, which is why I'm looking for someone to do it for me.



19.05.2008 14:51

4 Jörg Kruse

which is why I'm looking for someone to do it for me.

OK, then I'll move the thread to the classifieds forum ;)

19.05.2008 15:12 | edited: 19.05.2008 15:12

5 Christian

ok... well, not ok, I actually expected a different answer ;)



19.05.2008 15:23

6 Jörg Kruse

What kind of answer :)?

As far as I'm concerned, we can also try to solve this here in the forum. Your post just sounded as if you didn't want that.

What kind of script is it, anyway? Is there an installation guide for it?

19.05.2008 15:35

7 Christian

The little thing is called Magellan... there seems to be something like an installation guide.... My question really was whether someone knows Perl and would quickly install the script for me.


Magellan Metasearch is a collaborative web monitoring platform, and therefore should be deployed on a local network or website ; this manual assumes that the reader is familiar with the installation and administration of common Open-Source components such as Apache, MySQL and Perl ; I also assume the reader has the required permissions to install such a software.

Initial installation

Magellan Metasearch should work on any platform, given the following components are present :

* MySQL 3 / 4 (InnoDB recommended -- concurrency problems might appear with MyISAM),
* Perl version 5.8.x (Magellan hasn't been tested with former versions),
* Apache (the Win32 version seems to have some buffering problems).

The following Perl modules are also required :

* Data::Dump
* Net::DNS
* threads
* Encode
* HTML::Entities
* Time::gmtime
* Mail::Sendmail
* MIME::QuotedPrint
* Thread::Queue
* HTTP::Response
* HTTP::Request
* HTTP::Cookies
* HTTP::Headers
* utf8
* Digest::MD5
* Getopt::Std
* Cwd
* DirHandle
* ...and some other packages I might have forgotten. Perl will complain about those missing packages.
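
A quick way to see which of the listed modules are missing before touching Magellan itself is to ask Perl to load each one. The following is only a sketch, written in Python and assuming a `perl` binary on the PATH; the helper names `perl_can_load` and `missing_modules` are made up for this illustration and are not part of Magellan.

```python
import subprocess

# Modules named in the manual's list (the manual itself says it may be incomplete).
REQUIRED_MODULES = [
    "Data::Dump", "Net::DNS", "threads", "Encode", "HTML::Entities",
    "Time::gmtime", "Mail::Sendmail", "MIME::QuotedPrint", "Thread::Queue",
    "HTTP::Response", "HTTP::Request", "HTTP::Cookies", "HTTP::Headers",
    "utf8", "Digest::MD5", "Getopt::Std", "Cwd", "DirHandle",
]


def perl_can_load(module, perl="perl"):
    """Return True if `perl -MModule -e 1` exits with status 0."""
    try:
        result = subprocess.run([perl, "-M" + module, "-e", "1"],
                                capture_output=True)
    except OSError:
        # The perl binary itself could not be started.
        return False
    return result.returncode == 0


def missing_modules(modules, can_load=perl_can_load):
    """Return the modules for which the loader check fails, in input order."""
    return [name for name in modules if not can_load(name)]


if __name__ == "__main__":
    for name in missing_modules(REQUIRED_MODULES):
        print("missing:", name)
```

Anything reported as missing would then have to be installed through the usual channels for the server (CPAN or the distribution's packages).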

Magellan has been tested on the following platforms :

* Linux / MySQL4 / Apache 1&2
* NetBSD / MySQL3 / Apache 2

The Windows version of Apache seems to be subject to buffering problems, preventing Magellan from outputting data correctly (for instance, the request progress screen won't be updated until the Magellan process has gone away).

Magellan must be granted full access to the MySQL database -- including the permission to create new databases. The tool will indeed manage its databases in a completely autonomous way. Therefore, for the first installation of Magellan, no specific database needs to be created. All databases managed by Magellan will have the "magellan" prefix (as defined in magellan.conf). The "magellan_sys" database contains all system administration parameters (authentication, authorization, customization...). The "magellan_1", "magellan_2"... databases are attached to each user account. The "magellan" database belongs to the anonymous account.
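
The naming scheme described here can be summarized in a few lines. This is a sketch of the convention as the manual states it, not Magellan's own code; the function names and the `prefix` parameter are assumptions for illustration.

```python
def system_database(prefix="magellan"):
    """Name of the database holding the system administration parameters."""
    return prefix + "_sys"


def account_database(account_id=None, prefix="magellan"):
    """Name of the database attached to an account.

    account_id=None stands for the anonymous account, which uses the
    bare prefix according to the manual.
    """
    if account_id is None:
        return prefix
    return "{}_{}".format(prefix, account_id)
```

With the default prefix, `account_database(2)` gives "magellan_2", which is why the MySQL user needs the right to create databases and not just tables.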

It is necessary to modify several files in order to complete the installation of Magellan :

* magellan.conf : contains all database access parameters, e-mail parameters, core modules setup, and default directories ;
* Magellan/Callbacks.pm : needs to be modified to change the magellan.conf access path -- otherwise, Magellan will not work with cron !!

Magellan must be installed just like any CGI script. See Apache user manual for the correct configuration of CGI scripts.

In command line, the following parameters are available :

* -c : check all saved requests
* -c -r X:Y : check request Y of user X
* -t : check sources availability
* -l : remove all requests' locks (useful after Magellan crashed while checking a request)
* -p : proxy server management (see the chapter on proxies below)
* -u : upgrade the structure of all Magellan databases from a previous version
* -U : checks for source code updates and automatically downloads them if necessary (still requires manual installation)
* -S : checks for virtual sources updates online, synchronizes existing sources and downloads all new sources automatically

Ideally, search.pl should be called periodically by the 'cron' command, in order to collect newly available search results and test the sources' availability. When using the tool too intensively, your computer might get temporarily banned from a search engine ; or, a search engine might change its output HTML format, therefore making our source driver unusable. In both cases, it is necessary to test the availability of these search engines regularly, in order not to send them any queries as long as they can't be accessed properly.

Upgrading from a previous version (beta)

Since the first stable release of Magellan, it is possible to force the tool to parse the existing databases, and upgrade their structure when necessary -- this process does not destroy any data, so that you don't have to start using the tool from a blank database if you already installed a previous version. This can be done in command line, calling "search.pl -u" : Magellan will then upgrade all existing databases matching the prefix as defined in magellan.conf. The following operations are then processed :

* creation of missing tables ;
* upgrade of existing tables, without any data loss ; new fields are added, existing fields are altered in case of type mismatch ;
* update of indexes and primary keys ;
* destruction of any temporary tables (MEMORY engine).

During this operation, Magellan outputs lots of non-fatal error messages from the underlying MySQL engine ; most of these messages can be safely ignored, as long as they are related to CACHE tables. The standard output shows the list of operations done.

Attention : this feature is still experimental, and might be buggy in some ways. The upgrade process hasn't been tested with all previous versions of Magellan. If, using this feature, the underlying database is not upgraded correctly, if you feel some data has been lost, or if Magellan doesn't work properly after having been upgraded, please send me an error report including :

* the currently used Magellan version number ;
* the formerly used Magellan version number ;
* the database driver name ;
* the database server version ;
* the complete output log of the "search.pl -u" command.

I also noted some error conditions in MySQL version 4.0.x, using indexes : adding and dropping indexes on a field might make MySQL confused about the real number of indexes in use on a table ; it is therefore possible to exhaust the number of available indexes on a table with ADD INDEX / DROP INDEX commands (the maximum number is 32). Magellan might then complain that it can't create or drop an index. This problem is not fatal and will not make Magellan fail, although it can have some consequences on its performance.

Thanks for also dropping me an e-mail when everything works fine ;)


Up to now, there is no central administration tool for Magellan. All the required administration tasks must be performed directly through MySQL.

1) Users management

Database : magellan_sys
Tables : User, Session, LastSeen
Modules : Magellan::Auth::LoginPass, Magellan::Auth::LoginPass_NoGuest

The User table is extremely important to understand, since it contains all the connection parameters Magellan needs to authenticate users. Its structure may vary according to the authentication module that has been deployed. Additional fields are pretty straightforward. The base field ACCOUNT_ID enables several users to access one specific account, thus sharing their requests and alerts -- but with different access levels. This allows for better collaborative work. If this field is left undefined, then Magellan will assume the user has access only to his own account, defined by his user ID.

The LEVEL field is also important :

* LEVEL = 0 : the user doesn’t have any specific privilege ; he can only modify his own configuration parameters (except management and administration parameters) ; he doesn’t have access to privilege management (actions / visible sources).
* LEVEL = 1 : the user is a manager, which means he has almost full control over basic users : he can change their configuration parameters (except administration parameters). He cannot change their access privileges.
* LEVEL = 2 : the user is an administrator, he can change other users’ configuration parameters (including low-level parameters) and access privileges (available actions and sources).

In previous Magellan versions (< 1.2.2), all users have the lowest access level. It is therefore highly recommended to create an administrator account while upgrading from an older version.
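
The three levels can be read as a simple ordered scale. The sketch below restates the rules from the list above in Python; the enum member names and function names are invented for this illustration, and how Magellan checks levels internally is not shown in the manual.

```python
from enum import IntEnum


class Level(IntEnum):
    """Values of the LEVEL field as described in the manual."""
    USER = 0      # no specific privilege
    MANAGER = 1   # almost full control over basic users
    ADMIN = 2     # administrator


def can_edit_basic_users_config(level):
    """Managers and administrators may change basic users' configuration."""
    return level >= Level.MANAGER


def can_edit_administration_parameters(level):
    """Only administrators may touch administration (low-level) parameters."""
    return level == Level.ADMIN


def can_edit_access_privileges(level):
    """Only administrators may change available actions and sources."""
    return level == Level.ADMIN
```

Since every account upgraded from a version before 1.2.2 starts at level 0, none of these checks would pass for any user until one account is set to LEVEL = 2 by hand.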

Some more to know about user management :

* Adding a user : insert a record in the "User" table ; the "USER_ID" field must be set to a unique ID, and is used as a foreign key for all other tables. The notion of "User" is only relevant when using the LoginPass authentication module ; when no authentication is needed, then all users fall into the anonymous profile.
* Session IDs don't expire ! The browser-side cookies only expire when the browser is closed. Therefore, old session IDs will continuously accumulate in the "Session" table ; they are never purged in an automated way. They are however dated, and it is recommended to clean the table from time to time. These IDs must be cleaned with caution : they might still be in use by clients who have direct access to the system (such as RSS readers or external crawlers).
* The LastSeen table keeps a track of when each user was last seen using the system. This table is useful to identify unused user accounts that can be safely deleted.

Applicable to : Magellan::Auth::IP

* Using the IP authentication module, it is not necessary to manually manage user accounts ; Magellan will create automatically a new account when it detects a new connection from an unknown IP. The corresponding user ID will then be the integer value of the IP address.
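
To illustrate what "the integer value of the IP address" could look like: the sketch below assumes the common 32-bit big-endian reading of an IPv4 address. Whether Magellan uses exactly this conversion is an assumption; the function name is made up for the example.

```python
import ipaddress


def user_id_from_ip(address):
    """Interpret a dotted IPv4 address as an unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(address))
```

Under that assumption, a client connecting from 192.168.1.10 would end up with the user ID 3232235786.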

2) Authorization management

Database : magellan_sys
Table : Permission
Module : Magellan::Perms::Generic

The default permission model is pretty straightforward. Each user is associated to a record in the Permission table ; its fields control the ability of the user to use the corresponding action -- see the interface module for the list of available actions. Every time a user tries to perform a given action, the system checks the following :

* if no entry exists in Permission for the user, then the default policy is to grant the requested permission to the user ;
* otherwise, if the field "PERM_action" is undefined (or != -1), then the requested permission is granted ;
* otherwise, if "PERM_action" is set to -1, then the request is rejected.
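
The three rules amount to "allowed unless explicitly set to -1". A minimal sketch, assuming a Permission record is represented as a dictionary (or None when the user has no record); the action names used when calling it are hypothetical, the real ones are defined by the interface module.

```python
def is_allowed(permission_row, action):
    """Apply the default permission model described in the manual.

    permission_row: dict of field name -> value for the user's record in
    Permission, or None if the user has no record at all.
    """
    if permission_row is None:
        # No entry for this user: the default policy grants the permission.
        return True
    # An undefined field (missing or None) or any value other than -1 grants;
    # only an explicit -1 rejects.
    return permission_row.get("PERM_" + action) != -1
```

Note that this is a deny-list model: a freshly created user can do everything until someone writes -1 into the relevant fields.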

The "_MODULES" field contains the list of search engines the user is authorized to use. The search engine names are separated by a UNIX new line character ("\n"). If the field is left empty, then the user is authorized to use any sources by default.
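
Reading the "_MODULES" field then comes down to splitting on newlines and treating an empty field as "no restriction". Again only a sketch: the function names are invented, and the source names in any call are placeholders.

```python
def allowed_sources(modules_field):
    """Return the list of permitted source names, or None for 'any source'."""
    if not modules_field:
        # Empty or undefined field: no restriction.
        return None
    return [name for name in modules_field.split("\n") if name]


def may_use_source(modules_field, source):
    """True if the source is permitted by the given _MODULES value."""
    allowed = allowed_sources(modules_field)
    return allowed is None or source in allowed
```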

3) Customization

Database : magellan_sys
Table : Custom

The "Custom" table contains all the information necessary to customize the behaviour of Magellan when sending e-mail alerts. Each field is associated to a user account, so that different users may share different customization settings :

* EMAIL_ADDRESS : default e-mail address where to send alerts ;
* EMAIL_BODY : the HTML skin of the mail (the special string "%s" tells Magellan where to insert the body of the mail) ;
* EMAIL_ABSOLUTE_URL : tells whether Magellan should include URLs as server-side redirections or direct links (useful for example when a client can't directly access the corporate network, but still receives alerts) ;
* DONT_SHOW_REQUEST : don't write the request in the mail (in order to avoid any strategic information leakage)
* DONT_SHOW_SOURCES : don't enumerate sources (in order to avoid any strategic information leakage)
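
The "%s" placeholder in EMAIL_BODY works like a slot in a template. The sketch below assumes a plain substitution of the first "%s"; Magellan may well use Perl's sprintf instead, in which case other "%" characters in the skin would need escaping. The function name and the sample HTML are made up.

```python
def render_email(skin, body):
    """Insert the mail body at the first "%s" marker of the HTML skin."""
    return skin.replace("%s", body, 1)
```

For example, `render_email("<html><body>%s</body></html>", "<p>3 new results</p>")` yields the skin with the paragraph placed between the body tags.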

Moreover, since version 1.2.2, all lower-level configuration parameters (except the DATABASE, AUTHENTICATION and AUTHORIZATION values) are customizable. This enables, for example, different users to access the tool through different graphical user interface modules.

4) Proxy management

Database : magellan_sys
Table : Proxy

Since version 1.1.0, Magellan is able to manage proxy lists in an autonomous manner, which includes the following operations :

* Import a proxy list from an external Web source ;
* Deactivate unreliable proxies and only keep the best ones ;
* Activate inactive proxies from time to time in order to test them periodically ;
* Periodically truncate availability and speed statistics to avoid getting polluted with averages hiding more recent availability changes.

These operations can be triggered with the following command-line options :

* search.pl -p "reset stats" : reset all speed and availability statistics
* search.pl -p "reset all" : reset all stats and reactivate inactive proxies (field INACTIVE = 1)
* search.pl -p "purge 0.7" : deactivate all tested proxies that are available less than 70% of the time ; marks definitely inactive proxy servers that have been tested more than 10 times but have never answered
* search.pl -p "http://www.serveur-web.com/liste-proxy.php" : extracts proxies from an external data source (format : address:port)

These operations should be scheduled with the local cron, in order to let Magellan manage proxies alone.
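
What "purge 0.7" does can be pictured as follows. This is a sketch of the rule as the manual words it, not Magellan's code: the `Proxy` class, its field names and both function names are hypothetical, and the real Proxy table may store its statistics differently.

```python
from dataclasses import dataclass


@dataclass
class Proxy:
    address: str
    port: int
    tests: int = 0          # how often the proxy has been tested
    successes: int = 0      # how often it answered
    inactive: bool = False  # temporarily deactivated
    dead: bool = False      # "definitely inactive"


def parse_proxy(line):
    """Parse one 'address:port' entry from an imported proxy list."""
    address, port = line.strip().rsplit(":", 1)
    return Proxy(address, int(port))


def purge(proxies, threshold=0.7, dead_after_tests=10):
    """Deactivate tested proxies below the availability threshold.

    Proxies tested more than dead_after_tests times without a single
    answer are additionally marked as definitely inactive.
    """
    for proxy in proxies:
        if proxy.tests == 0:
            continue  # untested proxies are left alone
        if proxy.successes / proxy.tests < threshold:
            proxy.inactive = True
        if proxy.tests > dead_after_tests and proxy.successes == 0:
            proxy.dead = True
```

The periodic "reset stats" call matters for this rule: without it, a proxy with a long good history would stay above the threshold for a long time after it stopped responding.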

At least that is what I understand an installation guide to be. It is also available in French on request.


19.05.2008 15:50

8 Jörg Kruse

I downloaded the script from SourceForge

The first thing you probably need to do is edit the file magellan.conf.

For the installation I would adjust at least the following parameters:

Here the database credentials:

					USER		=>	'root',
					PASS		=>	'',
					DATABASE	=>	'magellan'

Here the server (installation) path and the full URL of search.pl:

		DIRECTORY				=>	'/home/htdocs/intranet/cgi-bin/',
		URL						=>	'',

... and here the e-mail addresses you want to use:

		MAIL_FROM				=>	'magellan@domain.com',
		MAIL_ADMIN				=>	'admin@domain.com',

After that, the path to magellan.conf should be adjusted in the files Magellan.pm and Callbacks.pm.

I would simply specify the absolute server path of magellan.conf here as well:

my $conf_file = 'magellan.conf';

19.05.2008 16:36 | edited: 19.05.2008 16:38

9 Christian


DIRECTORY				=>	'/home/cgi-bin/',
		URL					=>	'http://yomada.net/cgi-bin/search.pl',
my $conf_file = 'home/cgi-bin/magellan.conf';

I couldn't find Magellan.pm; which directory do you have it in?

Otherwise done...

I had almost gotten that far myself, but I hadn't adjusted the server paths in the other files...

Error 404!

File Not Found!

19.05.2008 18:25 | edited: 19.05.2008 18:28

10 Jörg Kruse

The server path should start with a slash

my $conf_file = '/home/cgi-bin/magellan.conf';
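
The difference is that a path without a leading slash is resolved relative to whatever directory the process happens to be started in, which is not the same under the web server and under cron. A small sketch in Python for checking a configured value; the function name is made up for the example.

```python
import posixpath


def is_absolute_server_path(path):
    """True if a Unix-style path starts at the filesystem root."""
    return posixpath.isabs(path)
```

Applied to the two variants from this thread, 'home/cgi-bin/magellan.conf' is reported as relative and '/home/cgi-bin/magellan.conf' as absolute.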

So the directory /home is the directory of yomada.net?

I couldn't find Magellan.pm; which directory do you have it in?

I can't find it either :/ - maybe the / was meant as an "or", in which case there is only Callbacks.pm

By the way, when I request http://yomada.net/cgi-bin/search.pl I get a 500, not a 404!

A 500 could be caused by the Perl interpreter not being found

The first line of search.pl specifies this path in the so-called shebang:


Check whether "/usr/bin/perl" is the correct path on your server; if not, you need to correct it
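
If you have shell access, this check can be automated: read the first line of the script, extract the interpreter path, and see whether an executable file exists there. The following Python sketch is only one way to do it; the function names are invented, and it has to be run on the server itself.

```python
import os


def parse_shebang(first_line):
    """Return the interpreter path from a shebang line, or None."""
    line = first_line.strip()
    if not line.startswith("#!"):
        return None
    parts = line[2:].split()
    return parts[0] if parts else None


def script_interpreter(script_path):
    """Read the first line of a script and return its interpreter path."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", errors="replace")
    return parse_shebang(first_line)


def interpreter_exists(script_path):
    """True if the script's shebang points at an existing executable file."""
    interpreter = script_interpreter(script_path)
    return (interpreter is not None
            and os.path.isfile(interpreter)
            and os.access(interpreter, os.X_OK))
```

Calling `interpreter_exists("search.pl")` in the cgi-bin directory would return False if the path in the first line does not exist on that machine.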

19.05.2008 19:02 | edited: 19.05.2008 19:03