CGIProxy (see http://www.jmarshall.com/tools/cgiproxy/)
========================================================

CH'CH'CH'CH'CHANGES:
--------------------

2.1.6, released February 4, 2013:
--------------------------------
Now can run as a FastCGI script.
Now can run without an external HTTP server, by using its own embedded secure HTTP server.
Installation is easier, as Perl modules can be automatically installed (including under your home directory) by running "./nph-proxy.cgi install-modules" from the command line. See the $LOCAL_LIB_DIR config option, if you need to install the modules and you're not root.
Windows support has improved.
Documentation has been improved, especially for installation.
Command-line usage is now documented; run "./nph-proxy.cgi -?" for usage.
There are some new config options, mostly for FastCGI support, the embedded server, and database support.
Some of the configuration section has been rearranged; most potentially needed config options are now near the top.
Fixed a bug handling spaces in path when using proxy_encode().

2.1.5b, released November 10, 2012:
-----------------------------------
Added redirection for Gmail to %REDIRECTS ; redirects to HTML-only version.

2.1.5, released October 21, 2012:
---------------------------------
Now optionally uses a server-side database to store cookies, which fixes "Bad Request" errors when user has too many cookies. Can use either MySQL or Oracle. Configure this with $DB_DRIVER, $DB_USER, $DB_PASS, and $USE_DB_FOR_COOKIES .
Now supports a simple mechanism to automatically redirect pages that aren't handled well by CGIProxy. For example, www.facebook.com is redirected to m.facebook.com (mobile), until we can get www.facebook.com working better. This is configured with the %REDIRECTS hash.
17 bugs fixed, mostly in JavaScript support but some in Flash and HTML support too.

2.1.4, released May 8, 2012:
----------------------------
Fixed a bug with chunked responses, making captcha work better.
Closed some privacy holes in JavaScript and Flash support.
Fixed some small bugs, making more pages work better.
Added "Delete all cookies" link to top of cookie management screen.

2.1.3, released April 27, 2012:
-------------------------------
Improved Flash support, including better support for online video. No delay with YouTube anymore.
Improved support for Same-Origin Policy (browser security).
Other security fixes.
Other fixes and workarounds, making more pages work correctly.

2.1.2, released April 8, 2012:
------------------------------
Added "download" link to footer.
Fixed "SSL options" bug.
No longer fails in Windows because of getpwuid() call.
Various security fixes.
Various small fixes to make more pages work better.
Internal: rewrote _proxy_jslib_handle() and _proxy_jslib_assign() more cleanly, using _proxy_jslib_instanceof() instead of the error-prone _proxy_jslib_object_type() (which was removed). Still works around many browser bugs. :P

2.1.1, released January 19, 2012:
---------------------------------
$ENCODE_URL_INPUT is now on by default, and is fixed to work better in all major browsers.
Link to CGIProxy home page in the footer is now proxified.
Now supports jQuery-based sites better.
Added support for a few non-standard HTTP request headers.
Works better with captcha.
$NO_COOKIE_WITH_IMAGE is now off by default, to support captcha.
Many small fixes and workarounds, including several privacy and security fixes.

2.1, released December 9, 2011:
-------------------------------
Flash 9 and later is now supported, which among other things means YouTube works through CGIProxy again.
Changed flag segment of full URLs to something less obvious.
Now supports "data:" URIs.
Now supports ECMAScript (JavaScript) version 5.
Many fixes in JavaScript support and elsewhere.
No longer eats memory when connecting to secure servers.
$PROXIFY_SWF is now set by default.
Now uses the Encode module instead of the old utf8:: stuff.
Added two config options $PROXY_DIR and $ALLOW_RTMP_PROXY, though they're not used yet.

2.1beta19, released December 25, 2008:
--------------------------------------
Fixed a couple of bugs with cookies, so they (including logins) should work better.
Fixed so that Safari no longer chokes on StorageList handling.
Various other small fixes.

2.1beta18, released August 10, 2008:
------------------------------------
Certain pages were very slow in MSIE due to the way MSIE implements Array.pop(), so a lot of code was rewritten to avoid using Array.pop(). Those pages now function at normal speed in MSIE. Some pages see an improvement of 10x or better.
proxify_js() is now about 5% faster due to reworking of $div_ok setting.
Now handles "application/xhtml+xml" content correctly.
$REMOVE_SCRIPTS and $HIDE_REFERER both now default to 0 (false).
Various small fixes, workarounds, and cleanup.

2.1beta17, released March 11, 2008:
-----------------------------------
Fixed a couple of bugs with SWF (Flash) support. In particular, tags are now proxified correctly when in certain s, and thus youtube.com now works when using MSIE.

2.1beta16, released March 3, 2008:
----------------------------------
Includes a ton of fixes and workarounds in JavaScript support, making more sites work through it. Includes a fair amount of performance improvement too.
Added experimental support for Shockwave Flash (SWF) apps. If you set $PROXIFY_SWF=1, the script will proxify the SWF bytecode so that any network accesses go back through the same script. It works with many but not all Flash apps. Sometimes it can slow down a page, if the page has e.g. lots of Flash ads.
Fixed port-handling in HTTP Basic authentication support.
Several other small bug fixes, cleanup, and comments.

2.1beta15, released October 26, 2006:
-------------------------------------
Fixed bug handling "javascript:" URLs that was causing some sites to fail.
Fixed bug in _proxy_jslib_cookie_encode() and _proxy_jslib_cookie_decode(), in the commented-out rot-13 line. Cookies should now work when using rot-13.
"about:blank" pages no longer generate the "WARNING:" page.
In the start page, the URL entry field now initially has focus (only if JS is running).
Added support for Element.getElementsByTagName to _proxy_jslib_handle() (though it's done in the form of "Node.getElementsByTagName").
Now explicitly defaults the first argument of "Document.open()" to text/html. This is supposed to happen anyway, but not all browsers do it correctly when writing to a frame.
Proxification of top-level "return" statements should now work better in a couple of ways.
Several more small fixes and cleanup, making more sites work.

2.1beta14, released October 16, 2006:
-------------------------------------
Better handling of "javascript:" URLs.
Now correctly handles "delete(...)" .
Various browser bugs/crashes are now trapped by try/catch blocks so they're not fatal.
More erroneous HTML and JavaScript is now worked around.
Many more small fixes and workarounds.

2.1beta13, released September 12, 2006:
---------------------------------------
The $TRANSMIT_HTML_IN_PARTS config variable has been removed; now you only need to set @TRANSMIT_HTML_IN_PARTS_URLS . Use of both was redundant.
Now correctly handles UTF-8 content.
Authorization cookies are now associated with the server and port, rather than just the server.
Worked around MSIE bug that shifts centered content to the right.
Now works around more erroneous HTML, JavaScript, and browser behavior.
Other minor fixes.

2.1beta12, released June 6, 2006:
---------------------------------
Now optionally supports compression (gzip) of message bodies, if the Compress::Zlib Perl module is installed. This also helps with certain server bugs.
To help with pages that are returned in parts with long delays between those parts, you can now use the $TRANSMIT_HTML_IN_PARTS and @TRANSMIT_HTML_IN_PARTS_URLS options to have CGIProxy process and return each piece of HTML as it receives it rather than wait for the whole page. This helps with certain library database queries, for example.
Further improved CSS handling, including better display of the top form.
Improved handling of "&#nnn;" HTML entities.
Better updating of the top form when using frames.
When $TEXT_ONLY is set, there are no longer any wasteful attempts to get the images (unless $RETURN_EMPTY_GIF is set). Also, $RETURN_EMPTY_GIF now defaults to 0 .
Now handles JavaScript's deprecated with() statement.
Now handles Document.referrer .
Cookie values are now allowed to have spaces in them, even though that's technically illegal; unfortunately, some sites require them to work. If this causes any problems, please let me know.
Other minor fixes.

2.1beta11, released March 14, 2006:
-----------------------------------
Improved (though not yet perfect) CSS handling.
Now uses a different system to track which elements have been proxified. (PLEASE tell me if you find any privacy holes, i.e. when the browser makes a direct connection to the end server when it should be going through CGIProxy.)
Fixed a hang when using MSIE and IIS.
HTML (e.g. innerHTML) is no longer doubly-proxified when using "+=".
Improved handling of &#nnn; -type entities.
No longer inserts ""; it was becoming more trouble than it's worth.
Better handling of data retrieved through XMLHttpRequest.
Cleaner support of Document.domain .
Many other minor fixes, workarounds, and cleanup, mostly in JavaScript support.

2.1beta10, released December 6, 2005:
-------------------------------------
Now supports UTF-16 pages, if using Perl 5.8.0 or later.
Now supports query-only URLs.
Window.open with a relative URL is now handled correctly.
In JavaScript, (non-standard) octal numbers and characters are now handled.
Now allows (erroneous) commas in cookie values, created by at least one server.
Now allows (erroneous) line terminators in JavaScript string literals if preceded by a "\", since browsers seem to allow it.
Many other minor fixes, mostly in JavaScript support.

2.1beta9, released October 27, 2005:
------------------------------------
Cleaned up URL handling in proxy_encode(), proxy_decode(), and their JS counterparts. Those routines are now back to their old format in which they contain only user-configurable statements. The "do not remove" code has been moved into wrapper functions; anywhere proxy_encode() is called should now call wrap_proxy_encode() instead, and the same is true for the three other related routines.
@BANNED_NETWORKS now includes the whole 127.x.x.x subnet.
Cookies are now handled more like browsers handle them (though not as the spec calls for), in a couple of ways.
In the JavaScript code, reworked handling of "delete", preincrement, and predecrement. Those operators are now handled much better.
Other minor fixes.

2.1beta8, released August 23, 2005:
-----------------------------------
Now more properly encodes all "?", "#", and others in default proxy_encode(). This includes further working around aforementioned Apache bug with PATH_INFO.
Now supports Accept-Language: header in requests, which means that pages will more often be returned in the expected language.
Added support for query-only URLs, i.e. those beginning with "?". Don't know why this never came up before.
A couple fixes and workarounds in JavaScript support.
Other minor fixes.

2.1beta7, released June 28, 2005:
---------------------------------
Added a "Report a bug" link in the top form, which will hopefully result in more bug reports.
Now supports (erroneous) JavaScript that contains the string " tags, to allow clicking on text.
Now can encode URLs before submitting them, if you set $ENCODE_URL_INPUT.
Now supports external tests for valid user IP address ($USER_IP_ADDRESS_TEST) and valid destination server ($DESTINATION_SERVER_TEST). Both may be either a command-line program, or a CGI script on a remote server. These features were added at the request of (and paid for by) the International Broadcasting Bureau.
No longer times out after 10 minutes, which will help with large files.
Added some more appropriate values to @BANNED_NETWORKS.
Cookies with no path specified now use a path of "/", even though that violates the spec, because that's how browsers treat them. If you want to follow the spec instead, you can set $COOKIE_PATH_FOLLOWS_SPEC.
Illegal cookie domains that contain only two dots but are not in one of the seven main TLDs are now allowed by default, because that's how browsers behave. You can follow the spec instead by setting $RESPECT_THREE_DOT_RULE.
Many other minor fixes, and workarounds for browser bugs and buggy sites.

2.0.1, released November 19, 2002:
----------------------------------
This release improves compatibility and installability in a few environments; it doesn't really add new features. In particular:
An SSL proxy is now supported with the $SSL_PROXY option, analogous to $HTTP_PROXY. Authentication for it is handled with $SSL_PROXY_AUTH, analogous to $PROXY_AUTH.
The $RUNNING_ON_WINDOWS option is no longer needed, and so no longer exists-- relevant code now determines the OS automatically when needed, or is handled another way.
Sockets now work correctly on BSDI and possibly other systems, where before the script generated messages like "Address family not supported by protocol family" and possibly others. It works now because socket address structures are created with the general pack_sockaddr_in() and inet_aton() functions from the Socket module, instead of the traditional hard-coded "pack('S n a4 x8')" method. The new way is more "correct", given that we're already loading the Socket module anyway.
It should help for future IPv6 support too.
All page insertions, including the initial "" comment, are now inserted *after* any initial declaration, to avoid confusing MSIE 6.0, which does not allow comments before an initial . The problem showed up as subtle errors in page elements (like fonts, etc.) when using MSIE; some or all of these problems should go away now.
Cache behavior when using a caching HTTP proxy should be correct now, because Pragma: and Cache-Control: headers (if available) are now passed through to the outgoing request. Before, caches would not always refresh as expected, so page reloads didn't always work.
Fixed a compilation error when using certain older versions of Perl.

2.0, released September 18, 2002:
---------------------------------
This is a MAJOR release, even though most of the changes are internal. Some things about the 2.0 script are fundamentally different from the 1.x series. Here's a list of changes, roughly categorized:

---- Visible new features and changes: ----

Now supports SSL, i.e. can retrieve pages on secure servers. For this to work, the separate packages OpenSSL and Net::SSLeay must be installed. If they are not installed, then CGIProxy still works but cannot download pages from secure servers. Also, it is strongly recommended to run CGIProxy on a secure server when SSL is supported, or else secure data will be compromised on the link between the browser and CGIProxy.
The top entry form now has an "UP" link, which links to the parent directory of the current URL. The idea comes from the (quite useful) button in Konqueror.
Handling of top insertions has been cleaned up-- FTP directory listings now include the correct insertions, and other cleaner behavior.

---- New or changed config options: ----

$RUNNING_ON_SSL_SERVER is now a three-way option: If it's set to '', then assume an SSL server if and only if port 443 is being used.
Besides being a good default, this lets you put the script where it can be served by both a secure server and a non-secure server.
Perl 4 is no longer supported. CGIProxy should run fine with Perl 5.004 or later. Better yet, upgrade to Perl 5.6.1 or later if you can-- future versions of CGIProxy may require that for some features.
FTP requests now use passive (PASV) mode by default, though you can use non-passive mode by setting $USE_PASSIVE_FTP_MODE=0.
$HTTP_PROXY and $NO_PROXY are now used instead of $ENV{'http_proxy'} and $ENV{'no_proxy'}.
For VPN-like installations, there is now a $QUIETLY_EXIT_PROXY_SESSION option that allows a smooth transition from browsing an intranet through the proxy to browsing external sites directly, without getting intermediate warning screens. It's not meant for any situation where anonymity is important. See the comments where it is set for more details.
Set the new $SESSION_COOKIES_ONLY option to make all cookies expire when the browser closes.
Set the new $MINIMIZE_CACHING option to minimize any caching that may be done by the browser, by using appropriate HTTP response headers. Cacheability of various responses has also been cleaned up in general.
If for some reason you want content (e.g. HTML) inside comments to be proxified like the rest, set the new $PROXIFY_COMMENTS option.
Improved the sample list of ad servers in @BANNED_IMAGE_URL_PATTERNS.
Cleaner and slightly different handling of $INSERT_HTML and $INSERT_FILE; see comments in user config section for details.
$FASTER_HTML_LESS_PRIVACY no longer exists, because it's no longer relevant with the new HTML-parsing structure (see below).

---- For programmers: ----

SSL support was implemented as a package that implements a tied filehandle, called "SSL_Handle". The idea came from the Net::SSLeay::Handle module, which for a couple reasons wasn't suitable to use directly.
The SSL_Handle package may be useful in other SSL applications, though this version is not a full implementation. It does buffer its input, though, which can make it a much faster alternative to Net::SSLeay's ssl_read_until() routine.
The entire section of code that modifies an HTML response (roughly 20-25% of the whole program) has been completely rewritten from scratch, and has a new structure. It's MUCH cleaner and slightly smaller than before, and is now encapsulated in the routine proxify_html(). It handles the heterogeneity of HTML correctly, i.e. non-HTML content that can exist within HTML (like scripts, stylesheets, comments, or SGML declarations) is correctly separated out and handled according to type. Also, tags are fully parsed into attributes and rebuilt if needed, instead of hacking it with long regular expressions as before; this removes a family of potential bugs and a lot of messy code. Overall, the results are more accurate, and the new structure is much more solid, flexible, and extensible, and much easier to work with than before; it should let us solve any new problems in the "right" way when they arise. :) :)
Many other sections of code have been partially or entirely rewritten, and behavior is generally cleaner.
Global variables are now handled much more cleanly, especially regarding their persistence when using mod_perl. Variables are now (almost) completely divided between UPPER_CASE constants which retain their values between runs, and lower_case variables which are reset for each run. After the user config section is a constant initialization section; both are run only during the first run of the script, and are skipped for efficiency during subsequent runs under mod_perl. The config and initialization sections have been arranged more carefully than before, for clarity and other reasons. Several variables have had their semantics clarified.
NOTE: If you modified the code in an earlier version and used certain config variables, you should review the new code before inserting the same changes. For example, a variable like $REMOVE_SCRIPTS should normally be replaced by $e_remove_scripts; the former is the config setting that never changes anymore, while the latter reflects the value used for this run of the program (which the user might change via a checkbox). There are several similar variables. Also, avoid modifying UPPER_CASE variables, because those will retain their values between runs under mod_perl... unless that's what you want.

---- Other stuff: ----

The HTTP client in CGIProxy now uses HTTP/1.1 if the browser uses HTTP/1.1. Before, only certain HTTP/1.1 features like the Host: header were supported; now, all required HTTP/1.1 client features are supported, so CGIProxy is "conditionally compliant" with HTTP/1.1 regarding its client functions, as per the HTTP spec. CGIProxy can't control all server functions, but for those that it does (such as the Date: header), it complies with HTTP/1.1.
Improved detection of text vs. non-text when supporting $NO_COOKIE_WITH_IMAGE, which makes some pages behave better.
Many changes to take advantage of Perl 5, such as "use strict", better regular expressions, references, and cleaner code all over the place.
Various other bugs fixed, privacy holes closed, performance improvements, UI improvements, code rearrangement, and cleanup.

1.5.1, released February 7, 2002:
---------------------------------
Headers are no longer split on commas, which among other things means that cookies work again. :P
A couple other minor bug fixes.

1.5, released November 26, 2001:
--------------------------------
Many changes this time around, some major, some minor. The code is about 50% larger.
Here's a list of changes, roughly categorized:

---- Most visible new features and changes: ----

A new cookie management screen lets the user view and selectively delete any cookies being sent through the proxy.
On pages with frames, any insertion (such as the small URL entry form) is now in its own top frame.
Cookies may now be encoded with the cookie_encode() and cookie_decode() routines, similar in concept to proxy_encode() and proxy_decode().

---- Making more pages work: ----

Referrer information, required by some servers, is now optionally sent to the server.
Certain pages with Flash or other embedded objects now work better.
HTTP URLs which include authentication in them (e.g. "username:password") are now supported.
Links to "javascript:" URLs are handled in a more friendly way.
Inline frames (i.e.