
When Do We Need MySQL Databases With Our Web Hosting?

Whenever you buy web hosting, your hosting provider is sure to include an option called “Databases” in the price quotation or package details. But not many of us know what databases are or why we need them when setting up a website. This article explains why we use databases and when we need them for our website.

The database systems we usually get with our web hosting accounts are relational database management systems (RDBMS). On a Windows hosting server we usually get Microsoft SQL Server (MSSQL), while on a Unix-based system we usually get MySQL. Both systems store your website data in tables for fast and easy retrieval.

A database is a collection of rows and columns in which each record is related to the others in some way. Because of these relationships, information can be retrieved far more quickly and efficiently than if the data were stored in an unstructured format. You can think of a database as a collection of spreadsheets that are all linked to one another in some way.

Security

A major advantage of storing your data in a database is the ability to protect it from unauthorized access and to protect individual records from being tampered with. A simple example is the storage of login credentials. Login credentials could be kept in a simple text file and read by the script that performs authentication, but such a file stores every password in plain text, making it vulnerable to anyone who gains access to it. With a database-driven setup, such entries are typically stored as salted hashes produced by a hashing algorithm, so the original values are never exposed. Further, reading from or writing to the database requires login details, something that is not easily enforced with an ordinary text file or spreadsheet.
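As a concrete illustration, here is a minimal sketch, using only Python's standard library, of how an application can salt and hash a password before the credentials are written to a database table, so that plain-text passwords are never stored. The function names are illustrative and not tied to any particular framework.

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return a random salt and the salted PBKDF2-SHA256 digest of the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest

def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the supplied password with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return hmac.compare_digest(candidate, stored_digest)
```

The salt and digest, rather than the password itself, are what get saved in the database record, so even someone who reads the table cannot recover the original passwords.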

Efficiency

Because of the way data is stored in a database, storage and retrieval are highly efficient. Unlike a spreadsheet, where anything and everything can be typed into any cell, SQL restricts the type of information stored in each column according to the rules you define. This validation is built into SQL and is an added advantage for a programmer or developer. Thanks to these in-built mechanisms, relational databases are the preferred method of storing and retrieving data.
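The sketch below illustrates this idea using SQLite, which ships with Python, purely because it is easy to run anywhere; MySQL and MSSQL offer the same kind of declared constraints. The table name and rules are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        id       INTEGER PRIMARY KEY,
        customer TEXT    NOT NULL,                       -- every order must name a customer
        quantity INTEGER NOT NULL CHECK (quantity > 0)   -- only positive quantities allowed
    )
    """
)

# A well-formed row is accepted.
conn.execute("INSERT INTO orders (customer, quantity) VALUES (?, ?)", ("Acme Ltd", 3))

# A row that violates the declared rules is rejected by the database itself.
try:
    conn.execute("INSERT INTO orders (customer, quantity) VALUES (?, ?)", (None, -5))
except sqlite3.IntegrityError as exc:
    print("Rejected by the database:", exc)
```

The validation happens inside the database engine, so every application that writes to the table is held to the same rules.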

An article on the DatabaseJournal Blog explains this in a lucid way:

“…The problem with text files is during a read, if the text file is large, it can take quite a bite of time to open and scan the contents of the file looking for what we want. Also, if we wanted to see all the sales to a specific customer, the entire text file would have to be read, and every line occurrence of the customer name would need to be saved in some temporary place until we had them all. If we saved to a spreadsheet instead of a text file, we would have a Sort feature built in. So we may be able to find all the sales to a specific customer quicker, but again, if the file was large, opening the spreadsheet could take a great deal of time.”

Administrative Control

Database systems also provide control mechanisms that make it easy to administer the database and the tables within it. The most important of these is user management. Each user can be granted specific rights on the database, limiting access so that the data cannot be tampered with or modified by the wrong people. Database administrators can also tune many settings, restricting the flow of data, connection speeds and the maximum number of connections to the server, and even adjusting how much of the server's hardware resources a given task may consume.
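As a rough sketch of this kind of user management, the snippet below creates two MySQL accounts with different privilege levels. It assumes the mysql-connector-python package and an existing administrative login; the host, database, user names and passwords are placeholders.

```python
import mysql.connector

# Connect with an administrative account (credentials are placeholders).
admin = mysql.connector.connect(host="localhost", user="root", password="admin-password")
cur = admin.cursor()

# An application account that can read and write data, but cannot alter table structures.
cur.execute("CREATE USER 'app_user'@'localhost' IDENTIFIED BY 'a-strong-password'")
cur.execute("GRANT SELECT, INSERT, UPDATE ON shop.* TO 'app_user'@'localhost'")

# A reporting account that may only read.
cur.execute("CREATE USER 'report_user'@'localhost' IDENTIFIED BY 'another-password'")
cur.execute("GRANT SELECT ON shop.* TO 'report_user'@'localhost'")

admin.close()
```

Giving each script or person the narrowest set of rights it needs limits the damage a compromised account can do.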

Tips To Prevent Failed Backups Of Your Website


Most of us keep regular backups of all our important data, especially data that lives on the internet. Websites should also be backed up regularly, both as a safety measure in case of disaster and as a version tracking mechanism. This ensures that if our website is hacked or becomes corrupt, we can quickly restore it to a prior version. Most of us have automatic mechanisms to take a backup, but we rarely realise that taking a backup doesn't end there. We need to ensure that it is a workable backup and not a dummy file of no value. This article explains how to prevent backup failures and how to ensure that your website backup will be usable when you need it.

Elements

An important aspect of taking backups is not to forget any of the elements that make up your account. Your web hosting account has several elements that need to be backed up so that you can restore the entire account: website files, databases, email, configuration files, web statistics, email forwarders and any other customizations you may have made to these elements. Hosting control panels like cPanel or Plesk may let you back these up separately or compile them into a single compressed file. Forgetting even one part of the backup may render your website useless; many people forget to back up their database files because the backend is not something users interact with. Likewise, keeping the small configuration files and customizations makes restoring the backup faster and smoother.

Disk Space Shortage

The in-built backup options of your web hosting control panel will usually save the backup in the home directory of your account, which means the backup also consumes hosting space within your account. If you are on a limited space plan, you should have at least 50% of your disk space free before you back up your account. Without enough space, the backup may not complete or may end up corrupt, which can be disastrous when you depend on it to restore the account. Ensure that you have enough free space so that the backup does not get stuck.
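A simple pre-flight check can catch this before the backup even starts. The sketch below, in which the home directory path is a placeholder you would replace with your own, estimates the size of the account and compares it with the free space on the disk.

```python
import os
import shutil

HOME = "/home/your-account"  # hypothetical account home directory

def directory_size(path: str) -> int:
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total

needed = directory_size(HOME)
free = shutil.disk_usage(HOME).free

if free < needed:
    print("Not enough free space for a full backup: "
          f"need ~{needed // 2**20} MB, only {free // 2**20} MB free.")
else:
    print("Enough free space to proceed with the backup.")
```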

Failed Download

One of the most common failure points when taking a backup is the download to your local machine or to the media you are ultimately backing up on. Very often, while the backup file is being downloaded, the network connection drops or the process is terminated. It may look as though the file has been downloaded when in fact it is only partial. A client of ours who was switching from shared hosting to a dedicated server was taking a backup of his website on his own. He started downloading the backup file, about 600 MB in size, from his hosting account, and his connection indicated that about 8 minutes were left for the download to complete. Partway through, his internet connection dropped. He noticed that the transfer had stopped and that a file was on his hard disk, but he did not check its size and assumed the entire 600 MB had been downloaded. In fact only 350 MB had come through and the backup file was corrupt. Luckily he noticed the mistake and was able to retrieve the backup file again. You may not always be so lucky.

Integrity Check with the Checksum

The best way to ensure your backup file's integrity is to use a checksum tool to verify the file before and after it is downloaded. Ideally, you obtain a unique hash value of the file from the web server. This hash is derived entirely from the file's contents and is practically impossible to forge. After downloading the file to your backup media, compute the checksum of the downloaded copy. The two checksums should match exactly. If the file is corrupt, tampered with or incompletely downloaded, the checksums will not match and you will know immediately. This ensures data integrity and gives you assurance of a healthy backup file.
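A minimal sketch of such a check in Python is shown below; the archive name is a placeholder, and the server-side checksum is whatever value your host publishes or lets you compute on the server.

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to save memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

server_checksum = "..."  # value obtained from the hosting server
local_checksum = sha256_of("backup-archive.tar.gz")  # placeholder file name

print("Backup intact" if local_checksum == server_checksum
      else "Checksum mismatch: re-download the backup")
```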


Web Hosting And Net Neutrality In A Nutshell


Net Neutrality is a phrase sprayed all over the internet, often without a simple and clear explanation for the layman. Web hosts are affected by Net Neutrality decisions just like any ordinary internet user. Many people are quick to point out that web hosts also practice data discrimination and should be placed on the same guillotine as ISPs. That is not the case: there is a huge difference between providing bandwidth and providing a web hosting service. This article points out a few reasons why the web hosting industry embraces Net Neutrality with open arms and why the comparison may be one of apples and oranges.

Low Entry Barriers

Unlike internet service in most countries, it is fairly simple for a person to start a web hosting business. To become an ISP, there are several legal and business hurdles to cross; it is not an easy task and needs deep pockets and considerable influence, on par with starting a telephone company or an electricity supplier. Starting a web hosting business, on the other hand, can be done with little or no investment. Web hosting resellers don't even need to own a server or commit to any sales to start their business, and a small or medium host can lease a hosting server for a few dollars a month. The cost of entering the industry is very low and regulation is negligible; there is no government control or paperwork required to sell hosting space. This automatically increases the number of players in the industry.

Intense Competition

These low entry barriers create such intense competition that the ultimate beneficiary is the end user. If one web host starts acting restrictive, people can easily switch to another without much ado. Because the customer-facing side of the business is virtual and online, switching is easy and there are no physical hindrances. In most places around the world, however, the number of ISPs is minuscule, and in rural areas there may not even be more than one or two. This kills the competitiveness of the business and gives the ISP a dominant position that can be misused to exploit subscribers. Often, when only a couple of ISPs operate, a sort of cartel and price fixing emerges that cannot be questioned. Anti-competition laws around the world prohibit such behavior and try to break up this dependence.

Cut-throat Pricing

Besides striving to offer the best service in terms of speed and resources, every web host aims to offer the best price too. The price war is so aggressive that customers find it hard to make a decision because they are spoilt for choice. This is exactly what ISPs don't want. ISPs want to create restrictions among different website services or online facilities by throttling the popular ones and charging a premium for them. They may recover the premium from the customers or ask the web services to cough up that money. If a web host throttled traffic for a certain type of domain name and prioritized traffic for one it sells (e.g. .co domain names), it would soon be out of business because all its clients would switch. As we have learnt from free online services, consumers do not like to be restricted: the more restrictions you place, the more they will shun your service.

Security and Convenience

Web hosts do restrict certain types of traffic and do some filtering, but this is not done to create unfair competition or to gain a dominant position; it is done to ensure the security and stability of their services. For example, a host may limit simultaneous FTP connections from a single IP address to a maximum of 50, to prevent abuse of the FTP server and to ensure that other users on the same shared server can enjoy the service too. ISPs, by contrast, want to filter traffic to commercially exploit the end user, knowing that the end user has limited options for accessing the internet.

Google Hates These Things You Do…

There are a couple of things that search engines look at when ranking your website in search results for various keywords. These include original content, the number of links pointing from other websites to yours, your overall online presence and the quantity of relevant content on your pages. People have tried long and hard to manipulate Google's search algorithm by trial and error of various methods; the bottom line is that you just can't fool Google. Although there may be more than 2,000 factors that determine how your site is ranked, there is a sure-shot list of things Google hates and will punish you for doing. This article highlights some key practices you should definitely avoid if you want to stay in Google's good books.

Plagiarized Content

Google hates copycats. The whole purpose of Google is defeated if it cannot serve up fresh, genuine and relevant content for a user's search query. When a person looks for something using Google, he depends on Google to provide the most accurate websites with the information he seeks. If Google does not catch the attention of the searcher, he will look for the information elsewhere, and Google will have lost potential ad revenue in the process. Google says:

“Purely scraped content, even from high-quality sources, may not provide any added value to your users without additional useful services or content provided by your site; it may also constitute copyright infringement in some cases. It’s worthwhile to take the time to create original content that sets your site apart. This will keep your visitors coming back and will provide more useful results for users searching on Google.”

Dummy Websites

To get backlinks from external websites, webmasters buy many domain names and put up rudimentary content with a link back to the parent website they want to boost in search results. Google calls these doorway pages and penalizes both the doorway sites themselves and the websites that use them. Since these websites or pages are “typically large sets of poor-quality pages where each page is optimized for a specific keyword or phrase”, they offer no added value to Google users.

“Google frowns on practices that are designed to manipulate search engines and deceive users by directing them to sites other than the one they selected, and that provide content solely for the benefit of search engines. Google may take action on doorway sites and other sites making use of these deceptive practices, including removing these sites from Google’s index.”

People also use such dummy websites to automatically redirect visitors to the real website, misleading users about where they are being taken. The dummy website shows up in Google search results, but the user is ultimately taken to the real website, which has nothing to do with the search results.

Paid Links

Paid links are links that are not earned through quality but are bought from websites willing to make a quick buck. This dilutes the quality of search results and leads to irrelevant or misleading information. Google classifies buying or selling links that pass PageRank, excessive link exchanges, large-scale article marketing or guest-posting campaigns with keyword-rich anchor-text links, and even the use of automated programs or services to create links to your site, as paid links. These are all liable to be penalized and can even get a site removed from Google's index.

Comment Spamming

Google’s Webmaster Support says that “If you’ve ever received a comment that looked like an advertisement or a random link to an unrelated site, then you’ve encountered comment spam.” Comment spam is visible on many blogs with popular content. Spammers post a comment or remark about the article and surreptitiously insert a link to their own brand or promotion in it. The comments are usually random praise for the article, an obscure sentence loosely related to its content, or sometimes outright junk text. All of these qualify for a penalty from Google.

A Simple Explanation Of What Big Data Is


Big Data has become the new buzzword in the IT industry. Everyone is talking about it and using it to impress others, even if they don't really know what it means. Big Data is often used out of context, more as a marketing gimmick than anything else. This article explains what Big Data really is and how it can be useful in solving problems.

Physics and mathematics can give us the distance from the East Coast of the USA to the West Coast accurate to about a yard. This is a phenomenal achievement and has been applied to various technologies in our daily life. The challenge arises when the data is not static but constantly changing, at a rate and in volumes far too large to measure in real time by hand. The only way to process such data is with computers.

IBM data scientists break big data into four dimensions: volume, variety, velocity and veracity. But there are many more aspects of Big Data. Big data can be described by the following characteristics:

Volume: the size of the data, which determines its value and potential and whether it can be considered Big Data at all.
Variety: the category or type of data involved. Knowing this helps the analysts who work closely with the data to use it effectively, which is what gives Big Data its importance.
Velocity: how fast the data is generated and how quickly it must be processed to be useful.
Variability: inconsistency in the data over time, which can be a problem for analysts.
Veracity: the quality of the data being captured. Accurate analysis depends on the veracity of the source data.

Analogies

An article on the Tibco Blog provides a very simple analogy for understanding what Big Data really is:

“One analogy for Big Data analysis is to compare your data to a large lake… Trying to get an accurate size of this lake down to the last gallon or ounce is virtually impossible… Now let’s assume that you have built a big water counting machine… You feed all of the water in the lake through your big water counting machine, and it tells you the number of ounces of water in the lake… for that point in time.”

A better, more visual analogy is presented by Paul Lewis of Hitachi Data Systems. He often explains Big Data by showing a cartoon crowded with hundreds of people, all busily doing different things. He explains:

“You need to find the person with the suitcase of money (Value)…but there are many people (Volume), all walking at various speeds running to work (Velocity), from all walks of life (Variety), some are crooks (Veracity).”

Importance and Benefits

One of the major reasons we need Big Data is prediction and analysis. One of the best examples of Big Data in action is the Large Hadron Collider experiment, in which about 150 million sensors deliver data 40 million times per second. After filtering out more than 99.999% of these streams, there remain about 100 collisions of interest per second. Another important example is Facebook, which handles over 50 billion user photos.

Healthcare is another area where Big Data can play a significant role. One of the most striking examples is Google Flu Trends, which analyses search data from various locations and uses Big Data analysis to identify patterns of influenza epidemics and endemics around the world. Although this data is not always accurate and may contain many false positives, it highlights the potential of what Big Data can show you.

A key benefit of Big Data is that there is no specific format in which it must be stored. Crudely put, it is a raw dump of data, i.e. it is unstructured. The system uses complex algorithms to classify and process this data, which is what makes it so special.

How To Prevent Spam Through Your Website Forms


We all hate spam in our email inbox, but we rarely try to get to the root of its origin and purpose. Many webmasters face the problem of receiving spam emails through the forms on their website. These are usually contact forms set up for visitors to post inquiries or give feedback. Spammers try to hijack these forms to send spam, either by manipulating where the form sends its emails or by flooding the webmaster with junk mail. This article points out some of the ways you can fight spam in website forms meant for comments, feedback, inquiries and other contact.

Form Fields Validation

A very important part of having a secure form is strict validation of the form fields. This is best explained by an example: when you accept a phone number through the contact form, you can code the form so that only numbers are accepted in that field. Similarly, for an email address field, the form must verify that a well-formed email address has been entered. If a field contains anything it is not supposed to contain, such as special characters or stray text, the form throws an error and is not submitted until the mistakes are corrected. This prevents malicious code or text from being inserted through the form, and it also stops automated bots from filling in the form without understanding what each field requires.
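The sketch below shows what such server-side validation might look like in Python using simple regular expressions; the patterns and field names are illustrative rather than production-grade validators.

```python
import re

PHONE_PATTERN = re.compile(r"^\+?[0-9]{7,15}$")            # digits only, optional leading +
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # rough shape of an email address

def validate_contact_form(fields: dict) -> list[str]:
    """Return a list of human-readable errors; an empty list means the form is clean."""
    errors = []
    if not PHONE_PATTERN.match(fields.get("phone", "")):
        errors.append("Phone number must contain only digits (7 to 15 of them).")
    if not EMAIL_PATTERN.match(fields.get("email", "")):
        errors.append("Email address is not well formed.")
    if re.search(r"<[^>]+>", fields.get("message", "")):
        errors.append("Message may not contain HTML tags.")
    return errors

# Example: a bad phone number is caught before the form is processed.
print(validate_contact_form({"phone": "abc", "email": "user@example.com", "message": "Hi"}))
```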

Captcha

One of the most effective ways of fighting form spam is to add a captcha at the end of each form. A captcha requires the user to type a word or number shown in an image, which prevents bots and automated systems from submitting the form mindlessly. Since bots cannot usually read text within images, they fail to enter the correct captcha text and the form is not submitted.

Confirmation Alert

Another simple trick that can be used to frustrate spammers is a confirmation alert box that pops up to confirm the details the user is trying to submit. Robots and automatic form-submission software are unable to click the confirmation button in the alert box, so a prompt such as “Are you sure you want to submit the form? Yes/No” adds a layer of protection against comment spammers. It is also a good way to let users review the information they are sending and correct any mistakes or typos before submitting the form.

Anti-Spam Plugins

Akismet is an anti-spam plugin for WordPress that can identify genuine comments and filter out spam. It is helpful if you do not want to enable a captcha or cannot add specific validation to your form. Similar anti-spam plugins are available for other platforms as well. Plugins like Akismet are not 100% accurate and may produce some false positives, but they do a very good job of filtering out the noise.

Logging Information

One of the most important things a programmer can do to track the sources and patterns of spam is to log additional information about the user. Along with the normal fields the user submits, the form also captures the IP address, machine name, browser details, location information and similar data that can be used to trace the spam source. Using this information you can take action against the spammer or simply block the IP address, so that the spammer cannot keep harassing you with random submissions.
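As an illustration, the sketch below logs the sender's IP address and browser details alongside each submission. It assumes the site runs on Flask; the route, log file name and form fields are placeholders.

```python
import logging
from flask import Flask, request

app = Flask(__name__)
logging.basicConfig(filename="form-submissions.log", level=logging.INFO)

@app.route("/contact", methods=["POST"])
def contact():
    # Record who sent the form and from where, for later spam analysis.
    logging.info(
        "form submitted: ip=%s agent=%s email=%s",
        request.remote_addr,
        request.headers.get("User-Agent", "unknown"),
        request.form.get("email", ""),
    )
    # ... normal processing of the submission goes here ...
    return "Thank you for your message."

if __name__ == "__main__":
    app.run()
```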

Some programmers also swear by honeypot form fields hidden with CSS: genuine visitors never see or fill them in, while bots that auto-complete every field do, so any submission with the hidden field populated can be safely discarded. Consult your developer for the best solution for your website.