3
\$\begingroup\$

First of all i made a simple code to insert data on my table, with the information from this code i made a loop to insert data on another table for a notification system.

I'm new to PHP and i think that this query that is being executed into a loop is really poor in performance, example: if i have like 100K users, this query will be executed 100K times... But the query will be executed 100K times at same time or is it going to run one after the other? I don't think so, because it's a loop, right? But i still think that my loop is poor in performance, i want to know how much and how can i improve it.

Anyways, what you guys think about my code?

More details is on the comments into the code bellow:

$slug = '';

if(isset($_POST["create"])){ 

  $slug = preg_replace('/[^a-z0-9]+/i', '-', trim(strtolower($_POST["itemName"])));
  $query = "SELECT `slug_url` FROM `table_tudo` WHERE `slug_url` LIKE ?";
  $statement = $conn->prepare($query); 
  $statement->execute(["$slug%"]);
  $total_row = $statement->rowCount();  
  $result = $statement->fetchAll();

  $data = $statement->fetchAll(PDO::FETCH_COLUMN);

  if(in_array($slug, $data)){
    $count = 0;
    while( in_array( ($slug . '-' . ++$count ), $data) );
    $slug = $slug . '-' . $count;
  }

  $insert_data = array(

    ':title'              => $_POST['itemName'],
    ':epTitle'            => $_POST['epTitle'],
    ':itemName'          => $_POST['itemName'],
    ':data'               => $_POST['data'],
    ':datePublished'      => $_POST['datePublished'],
    ':dateModified'       => $_POST['datePublished'],
    ':descricao'          => $_POST['itemName'].' - Tipo '.$_POST['epNum'].' - '.$_POST['descricao'],
    ':epCapa'             => $_POST['epCapa'],
    ':alt'                => $_POST['itemName'].' - Tipo '.$_POST['epNum'].' - '.$_POST['epTitle'],
    ':audio'              => $_POST['audio'],
    ':qualidade'          => $_POST['qualidade'],
    ':epNum'              => $_POST['epNum'],
    ':tipo'               => 'ep',
    ':keywords'           => $_POST['keywords'],
    ':slugForitemPage'   => $slug,
    ':player_SD'          => $_POST['player_SD'],
    ':player_HD'          => $_POST['player_HD'],
    ':slug_url'           => "https://example.com/index.php?page=".$slug.'&ep='.$_POST['epNum'],
    ':slug_link'          => $_POST['slug_link'],
    ':entry_type'         => 'item'

  );

  $query = "INSERT INTO table_tudo (title, epTitle, itemName, data, dataFL, datePublished, dateModified, descricao, epCapa, alt, audio, qualidade, epNum, tipo, keywords, slugForitemPage, player_SD, player_HD, slug_url, slug_link, entry_type) VALUES (:title, :epTitle, :itemName, :data, NOW(), :datePublished, :dateModified, :descricao, :epCapa, :alt, :audio, :qualidade, :epNum, :tipo, :keywords, :slugForitemPage, :player_SD, :player_HD, :slug_url, :slug_link, :entry_type)";
  $statement = $conn->prepare($query);
  $statement->execute($insert_data);

  // this is used on the INSERT query bellow
  $epNum    = $_POST['epNum'];
  $itemName = $_POST['itemName'];
  $status   = 'unread';
  //

  // here i made a query to get all user_id where itemName is = to $_POST['itemName'];, this information 
  // is on a table called `seguiritem`, i will know it by the $_POST['itemName'], that is get when i execute the INSERT query above
  $checkUserFLW = $conn->prepare("SELECT `user_id`, `itemName` FROM seguiritem WHERE itemName = :itemName");
  $checkUserFLW->bindParam(':itemName', $itemName, PDO::PARAM_STR);
  $checkUserFLW->execute();
  $resultRow = $checkUserFLW->rowCount();

  // if the name of the item is on the table, this will execute the code inside this condition
  if($resultRow = 1){

    // i have a table called `noti`, on this table i store the (user_id, epNum, itemName, status), this is
    // a table that i retrive data to a notification system(i think it's not relevant to my question, but 
    // if you think so, place a comment bellow), on the above query i selected all the `user_id` that is 
    // on the table `siguiritem`, with this information i made a loop
    foreach($checkUserFLW as $ckUsFLW){

      //for each user_id that have the column `itemName` equal to 
      // $_POST['itemName'] on the table `seguiritem`, will execute this 
      // query to insert on the table `noti` the `user_id`, `itemName`, 
      // `epNum`(this is like a identifier), and `status` that will
      //  be = `unread`. 
      $user_id = $ckUsFLW['user_id'];
      $queryISNOTI = $conn->prepare("INSERT INTO noti (user_id, epNum, itemName, status) VALUES (:user_id, :epNum, :itemName, :status)");
      $queryISNOTI->bindParam(':user_id', $user_id, PDO::PARAM_INT);
      $queryISNOTI->bindParam(':epNum', $epNum, PDO::PARAM_INT);
      $queryISNOTI->bindParam(':itemName', $itemName, PDO::PARAM_STR);
      $queryISNOTI->bindParam(':status', $status, PDO::PARAM_STR);
      $queryISNOTI->execute();
    }

  }

}

I'm not sure if i explained it well, please any doubts just ask me.

\$\endgroup\$
11
  • 1
    \$\begingroup\$ Move prepare out of the loop \$\endgroup\$ Commented Dec 2, 2018 at 14:08
  • \$\begingroup\$ @YourCommonSense This is all? \$\endgroup\$
    – mario
    Commented Dec 2, 2018 at 14:29
  • \$\begingroup\$ I also want to know what you think about my code, i'm new and if i made a mistake on this code i don't want to repeat it later when i make another code. \$\endgroup\$
    – mario
    Commented Dec 2, 2018 at 14:31
  • 1
    \$\begingroup\$ if($resultRow = 1){ condition is either wrong and useless even if fixed. you can take it out \$\endgroup\$ Commented Dec 2, 2018 at 15:02
  • 1
    \$\begingroup\$ I pretty much reviewed it in the comments and how to insert faster is a question of its own, already covered on stack overflow. Or I've got an example of my own, inserting multiple rows \$\endgroup\$ Commented Dec 3, 2018 at 16:47

1 Answer 1

3
\$\begingroup\$

what you guys think about my code?

Well, because review content is to be positioned in the "answer" section, I'll add my supplemental thoughts to YCS's commented recommendations to obey the StackExchange design and give you an opportunity to mark this page as resolved if you wish.

  1. I don't see any use for $slug = '';. If there is no $_POST['create'], I assume your script doesn't do much at all.
  2. You don't need the case-insensitive pattern modifier on your hyphen substituting process because your are already calling strtolower() on the input. The pattern could be condensed to /[^a-z\d]+/
  3. The first query looks good and tight.
  4. I don't see the need to declare unused variables:

    $total_row = $statement->rowCount();  
    $result = $statement->fetchAll();
    
  5. Your unique slug generating loop looks good and tight.

  6. I would like to question your table_tudo structure.
    • If title and itemName can, in the future, contain different values then it is sensible to design two different columns, otherwise avoid the the table bloat and just use one column to store the title value.
    • If datePublished and dateModified are truly controlled by your users, then it is okay to feed the user's input to these columns, if not I don't see why you shouldn't declare CURRENT_TIMESTAMP as the DEFAULT value in the table schema and avoid writing it in the INSERT query.
    • I don't see the need to re-store itemName and epNum in the descricao column. If in the future, you wish to modify the - Tipo part, then you will have to do a giant string replacement task on your database. Conversely, if you only save the raw $_POST['descricao'] value in the column, you can access the other two columns and perform the concatenation during output only and have full control on how the data is displayed. Basically, I'm advising that you avoid the bloat and potential complication of concatenated data storage.
    • The same point again regarding the alt column. In fact, alt can be removed as a column entirely because all pieces of data are already stored in other columns.
    • If tipo and entry_type are always ep and item respectively, declare them as the DEFAULT value for these columns in the schema.
    • With slug_url, again, I don't endorse the storage of static data in tables. The epNum value is stored elsewhere, and you can very easily hardcode the beginning of the url in your php when it is time to display. I recommend renaming the column to clean_slug or something and just store the raw hyphen-substituted value alone.
  7. I would recommend that $status be written a little further down the script to keep it close to its usage. You might even prefer not to issue a placeholder for it and just write it directly into your prepared statement. However, as I mentioned before, you might best just declare the DEFAULT value in your table schema.
  8. Omit the $resultRow declaration and check as they are not necessary as pointed out by YCS.
  9. As recommended by YCS, move your prepare() line above your foreach() line, because it only needs to be called once and can be used over and over.

We don't know if you are using INNODB or MYISAM, but this may be of interest to you: https://dba.stackexchange.com/q/16395/157408

YourCommonSense's website also has a clean snippet for iterating prepared INSERT queries inside of a transaction which is certainly worth a look. https://phpdelusions.net/pdo_examples/insert#multiple

I have read a few pages on StackOverflow and DBAStackExchange about the potential benefits of performing batched inserts in an effort to reduce total trips to the database, but I didn't find a definitive resource to link to and I'll not venture to post firm any advice. I think if there is any benefit there is a sweet spot around 5000 rows due to autolocking, but again I'm not going to expose myself to any critique on this subject.

Finally, if your notification system only need to know if a user should be notified or not (the necessary data is binary) and the alert is only pushed once, then perhaps a rethink of how the alert data is stored is in order. If you don't need to perform any meaningful queries on the noti data, then perhaps you could be happy with INSERTing a single row with epNum and itemName which is connected to a csv or json string of user_id values. Do you actually need to track the read status on millions of rows? Maybe your notification system could be just as happy chewing on a file that contains csv or json data. There are always trade offs to sacrificing normalized table structure for performance, but you may need to improve the user experience for this script and experience slight lag somewhere else that the user doesn't experience it at all. (sorry that this part is rather vague)

\$\endgroup\$

Not the answer you're looking for? Browse other questions tagged or ask your own question.