Wednesday, February 10, 2010

Find Duplicate Files Suite (Initial)

FindDup Suite

FindDup Suite contains 3 parts:
- dupselect.php
- is the duplicate file finder, which searches for duplicate files in the
current directory or a specified directory. Progress is printed to STDERR and
the duplicate file list is printed to STDOUT.

$ perl [Path] > [duplicate-files.txt]
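
The finder script itself is not reproduced in this post. As a rough idea of what such a tool does, here is a minimal Python sketch (not the author's Perl code) that groups files by size and then by CRC32, writing progress to STDERR. The 8-digit hex values shown in the DupSelect listing look like CRC32 checksums, but that is an assumption, and so are the function names and the returned structure.

```python
import os
import sys
import zlib
from collections import defaultdict


def crc32_of(path, chunk_size=1 << 20):
    """Return the CRC32 of a file as 8 lowercase hex digits."""
    crc = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return "%08x" % (crc & 0xFFFFFFFF)


def find_duplicates(root="."):
    """Return groups of (path, checksum) for files sharing size and CRC32."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for _size, paths in sorted(by_size.items()):
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_crc = defaultdict(list)
        for path in paths:
            sys.stderr.write("checking %s\n" % path)  # progress goes to STDERR
            by_crc[crc32_of(path)].append(path)
        for crc, same in sorted(by_crc.items()):
            if len(same) > 1:
                groups.append([(p, crc) for p in sorted(same)])
    return groups
```

Calling find_duplicates(".") and printing each group to STDOUT would give a list similar in spirit to duplicate-files.txt. Matching size and CRC32 is a strong hint rather than proof of identical content, so a byte-by-byte comparison would be a sensible extra check before deleting anything.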

dupselect.php is a PHP web application that helps the user select duplicate
files. Copy or move [duplicate-files.txt] to the same directory as
dupselect.php on the web server. For example, set up a web server with PHP on
localhost, copy dupselect.php and duplicate-files.txt to /var/www/html, then
open a web browser (such as Firefox) and go to http://localhost/dupselect.php.
You will see a screen like this:

duplicate-files.txt DupSelect CalcTotal

The DupSelect link takes you to duplicate file selection, using
duplicate-files.txt as the duplicate file list.
The CalcTotal link shows you a list and calculates the total size of all files in the

In DupSelect mode, you will see groups separated by horizontal lines like this:


[ ] ./20080428/kaberen22204.jpg (f66f689f)
[X] ./20080427/kaberen22204.jpg (f66f689f)

[ ] ./20080105/T2AW_WP_3/T2AW_10/T2AW_10z.psd (274a3199) (*)
[X] ./20080910/Al_4575/T2AW_10/T2AW_10z.psd (274a3199) (**)
[X] ./20071124/T2AW_WP_2/T2AW_10.psd (274a3199)

[ ] ./20080506/moeura26162.png (37645a01)
[X] ./20081228/moeura46760.png (37645a01)

The first file in each group is kept by default. dupselect.php treats a file
in a deeper directory as more important.
There are two marks, (*) and (**). (*) indicates that the group contains a
file in a deeper directory, while (**) means that more than one file is in a
deeper directory.
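
The ordering and marks described above can be sketched roughly as follows, in Python rather than the script's own PHP. This is an interpretation, not the actual dupselect.php logic: it assumes a file counts as "deeper" when it sits more than a threshold number of directories down (the default of 2 mirrors the $normal_depth setting shown later in this post), that the first such file gets (*) and any further ones get (**), and the function names are made up for illustration.

```python
def directory_depth(path):
    """Count the directory components of a relative path such as ./a/b/c.jpg."""
    parts = [p for p in path.replace("\\", "/").split("/") if p not in ("", ".")]
    return max(len(parts) - 1, 0)


def order_and_mark(paths, normal_depth=2):
    """Return (keep, path, mark) tuples with deeper files first.

    Assumption: the first file deeper than normal_depth is marked (*)
    and every further one (**).
    """
    # sorted() is stable, so files of equal depth keep their input order.
    ordered = sorted(paths, key=directory_depth, reverse=True)
    rows = []
    deeper_seen = 0
    for index, path in enumerate(ordered):
        mark = ""
        if directory_depth(path) > normal_depth:
            deeper_seen += 1
            mark = "(*)" if deeper_seen == 1 else "(**)"
        rows.append((index == 0, path, mark))  # only the first file is kept
    return rows


group = [
    "./20071124/T2AW_WP_2/T2AW_10.psd",
    "./20080105/T2AW_WP_3/T2AW_10/T2AW_10z.psd",
    "./20080910/Al_4575/T2AW_10/T2AW_10z.psd",
]
for keep, path, mark in order_and_mark(group):
    print(("[ ] " if keep else "[X] ") + path + (" " + mark if mark else ""))
```

With the three .psd paths from the listing above, this puts the two files that are three directories deep first, marks them (*) and (**), and leaves the shallower file unmarked at the end, which matches the order shown in the example group.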
After you press the [generate] button, the selected files are written to
duplicate-files-delete.txt and duplicate-files-delete.lst on the server.
duplicate-files-delete.txt lets you view a report in CalcTotal mode.
duplicate-files-delete.lst is for deleting using

Inside dupselect.php:
There are some variables in dupselect.php for sorting and selecting:

$excludes_order = array('detail');
$includes_order = array('waren','kaberen','moeren','kabeura','moeura');
$deselects = array('this-one-needs-duplicate');
$normal_depth = 2;
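
As a rough illustration of how lists like these could drive ordering and selection, here is a small Python sketch (the script itself is PHP and is not shown in this post). Plain substring matching on the path, the ranking scheme, and the function names are all assumptions, and directory depth is left out to keep the example short.

```python
EXCLUDES_ORDER = ["detail"]
INCLUDES_ORDER = ["waren", "kaberen", "moeren", "kabeura", "moeura"]
DESELECTS = ["this-one-needs-duplicate"]


def rank(path):
    """Lower rank sorts earlier; words are matched as plain substrings."""
    if any(word in path for word in EXCLUDES_ORDER):
        return len(INCLUDES_ORDER) + 1  # pushed to the back of the group
    for position, word in enumerate(INCLUDES_ORDER):
        if word in path:
            return position  # earlier list entries come first
    return len(INCLUDES_ORDER)  # no match: between the two extremes


def select_for_deletion(paths):
    """Return (path, selected) pairs; the first file after sorting is kept."""
    ordered = sorted(paths, key=rank)
    result = []
    for index, path in enumerate(ordered):
        protected = any(word in path for word in DESELECTS)
        result.append((path, index > 0 and not protected))
    return result
```

In this sketch, a path containing "kaberen" would sort ahead of one containing "moeura", a path containing "detail" would go last, and anything matching an entry in DESELECTS would never be ticked for deletion.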

The $excludes_order variable controls which files are placed at the back.
The $includes_order variable controls which files are placed toward the front,
but a file in a deeper directory still has higher priority.
The $deselects variable controls which files are never selected in
DupSelect mode.
The $normal_depth variable controls which files become more important.

is a utility for deleting files using a .lst list file.
$ perl [duplicate-files-delete.lst]
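
The delete utility is not shown in this post either. A minimal Python sketch of the same idea follows; it assumes the .lst file holds one path per line, which the post does not state, and the function name and dry_run flag are made up for illustration. It defaults to a dry run so nothing is removed by accident.

```python
import os
import sys


def delete_listed_files(list_path, dry_run=True):
    """Delete every file named in list_path (one path per line, an assumption).

    Returns the paths that were deleted, or would be deleted in a dry run.
    """
    handled = []
    with open(list_path) as handle:
        for line in handle:
            path = line.rstrip("\r\n")
            if not path:
                continue  # skip blank lines
            if not os.path.isfile(path):
                sys.stderr.write("skipping missing file: %s\n" % path)
                continue
            if not dry_run:
                os.remove(path)
            handled.append(path)
    return handled
```

Running it once with dry_run=True to review the returned paths, and only then with dry_run=False, keeps the destructive step explicit.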


vnc2flv-20100207 Win32

Compiled with Py2Exe/Python 2.6 with a modified
Download: vnc2flv-20100207.7z
#!/usr/bin/env python
import psyco

import os, sys
from distutils.command.build_ext import build_ext

try:
    # load py2exe distutils extension, if available
    import py2exe
except ImportError:
    pass
from distutils.core import setup, Extension
from vnc2flv import __version__

py2exe_options = dict(includes=['flvscreen'],  # Include
                      excludes=['_ssl',  # Exclude _ssl
                                '_hashlib', 'bz2', '_ctypes',
                                'doctest', 'pywin', 'pywin.debugger', 'pywin.debugger.dbgcon',
                                'pywin.dialogs', 'pywin.dialogs.list',
                                'pickle', 'calendar'],  # Exclude standard library
                      dll_excludes=['msvcr71.dll'],  # Exclude msvcr71
                      )

setup(
  name='vnc2flv',
  version=__version__,
  description='Screen recording tool that captures a VNC session and saves as FLV',
  long_description='Vnc2flv is a screen recorder. It captures a VNC desktop session '
  'and saves it as a Flash Video (FLV) file.',
  author='Yusuke Shinyama',
  author_email='yusuke at cs dot nyu dot edu',
  keywords=['vnc', 'flv', 'video', 'screen recorder'],
  classifiers=[
    'Development Status :: 4 - Beta',
    'Environment :: Console',
    'Intended Audience :: Developers',
    'Intended Audience :: Science/Research',
    'License :: OSI Approved :: MIT License',
  ],
  options={'py2exe': py2exe_options},
  console=['tools/', 'tools/', 'tools/', 'tools/', 'tools/'],
  zipfile="vnc2flv.lib",
)