Popular Content

Showing content with the highest reputation since 03/23/2023 in all areas

  1. Dear Ninja,

It's been a very turbulent last couple of months. We started having server issues out of nowhere in late November 2022. This was especially surprising and disorienting because, at that point, we had not made any code changes to the server since about July 2022. We were sent on a wild goose chase to figure out why these issues were happening, and without our dear beloved Robin (rest in peace) to rescue us from issues like this as he had in the past, we had an especially hard time. I want to write this development log to explain why this issue was so pervasive, evasive, and abrasive, so players understand why this wasn't a simple "fix your server" kind of issue.

To understand the rest of this post, you need to know what the following terms refer to:

Server Software - Our proprietary, in-house server software that runs the authoritative logic for Nin Online and handles networking, i.e. keeping players connected and sending data between them.
Server Hardware - The actual machine that we rent to run the server software. This hardware belongs to a third-party company.
Third Party Service Providers - Services that provide us with databases for the most part in our case, but the term can refer to any company that provides SaaS.
DDoS - Distributed Denial of Service attack, typically on a server, to prevent normal operations.
SYN - A client requests a connection to a server by sending a SYN (synchronize) message to the server.
ExitLag - A shady company

Phase 1: Locating Fault

With software as complicated as Nin Online, there are a lot of places where a fault can be found. All we knew based on player reports was the following... There are possibilities in third-party service providers, the client, server software, server hardware (OS, networking); it could come down to almost anything. First I'll talk about what we tried in Phase 1.
- Restart Server Software
- Restart Server Hardware
- Check and reboot all third party service providers (MongoDB, MySQL)

So basically all the things you do when you have faulty technology: "Have you unplugged it and plugged it back in?" "Have you tried restarting your modem?"

The next thing we tried was to make sure it wasn't an isolated issue with that specific server. We rented a new VPS and hosted Nin Online there for a while to see if the problem was solvable that way, and whether it came down to Windows Server settings, an issue or change at our hosting provider, DNS issues, or anything else that could be isolated to the server hardware or provider. This was not the case, so we moved back to the original server.

From this, we diagnosed that the issue must have to do specifically with our server software, because it happened on multiple different server hardware. A further clue was that Nin Online's Brazil server was fine, and the Brazil server didn't have a lot of the code changes that Nin Online NA had. So it was a good lead. Even though we hadn't made any recent code changes, it was not impossible that earlier code changes had only started acting up later.

Phase 2: Looking through code changes

We first looked at content changes. Nin Online's engine gives developers a large amount of freedom to create content on the fly. Although no code changes were made, it was entirely possible that the issue was caused by a content change, e.g. a new map, a new item, or a new NPC. But nothing really aligned with the timeline in a way that would cause the bug.

There was one thing that stood out... Erox had just launched the Christmas Event, and this year was the first time we had pathfinding changes. This led to the train of thought that perhaps a massive number of NPCs (Zombies) was causing the server to hang for a long time, and during this time the server could not send any data to players, hence the hanging.
The caveat to this is that our pathfinding is threaded. Meaning, even if the pathfinding was hanging, the rest of the server processes should've continued fine. But to be safe, we decided to first disable all A* pathfinding. We left the server online to see if the issue stopped, but it persisted.

We later went back to the drawing board many times, looking at what content or code could have changed. We investigated whether Erox had added any events, NPCs, items etc. and forgotten about them. (He hadn't.)

The next thing I tried was to look through all the error logs that the server created. There were a dozen or so errors the server was throwing that seemed inconsequential. These could be things like a projectile/jutsu trying to target a player that was already disconnected. The server would normally ignore these errors, but I fixed them just to be sure. This didn't help either.

After a few days, after discussing it with Wolf, we thought perhaps the issue could be due to threading pathfinding at all. Threading it in the first place was a risky idea, even though necessary, because as I said, A* pathfinding is expensive. So we decided to remove threading for pathfinding. This didn't solve the issue, but it did mislead us for a few months.

Phase 3: Completely misled

A few weeks later, having made no progress on the issue, we started looking at other data. Sadly, as we'll soon find out, the data lied. We looked at server performance while the server was having these mass hanging/spike/disconnection events. We did this by profiling the server and looking into metrics we have been collecting for years, and we found that during increased player activity, the server showed obvious signs of degradation. I'm skimming through weeks of work to collect data, but basically our finding was that there was a correlation between these hanging issues and degradation of server stats, namely TPS. The server was running fewer ticks per second when these issues were occurring!
Hoorah, if we can figure out what is causing this, we can solve the issue. We spent weeks trying to figure out what was causing the server performance drop. Clearly something was wrong with it; if we solved it, we would most definitely fix the issue... right...?

So we started looking into the call stack and performance profilers to figure out what was causing the drops in performance. We looked at what changes could have been made around late November that could cause it. (Just note that although the graph looks like it only shows degradation in mid-December, this is only obvious now that we have a lot more data than we did in December.)

We found that certain packets ran processes on the server that were taking a long time to complete, namely packets/processes that interacted with MongoDB. So I spent a few days moving these processes to Jobs (basically threading). It was possible that, because these packets were not threaded, the server was pausing for a long time while it processed them on the main thread, hence causing the hanging. Unfortunately, this did not resolve the issue either.

The server was optimized. There should have been nothing left that took so long to process that it was bringing the TPS down... except... pathfinding.

We later realized that the reason TPS was down was simply that we had stopped threading pathfinding. Pathfinding was so expensive that it was single-handedly bringing TPS down more than anything else. We sent ourselves on a wild goose chase because of what we had done in Phase 2. In hindsight, of course this was the case. But we were trying and doing so many things at once that we lost track of what we had changed, and we forgot to go back to basics. We were consumed with the idea that the TPS was causing the login issues, when it wasn't. It was months of stress trying to figure out what in the server was causing the TPS to drop that much, and it was just pathfinding. I'm glossing over days of me and Wolf diagnosing server performance.
We ran third-party software on the server software to figure out what was causing it to hang. But this was work. Real work.

Phase 4: Back to basics

We went back to the drawing board and looked at what we knew as fact: the timeline of everything we knew. We decided to look into what was happening during one of the disconnection events. We let the server fail in debug mode so we could look into the internals of the server while it was having disconnections. Up until this point, the server was still functional for most of the day; it was just crashing every 24 or so hours. I was on full-time watch for the server, making sure it went back up when it was down. It had been months of this, and it was stressing me out a lot.

We noticed that the server had a large number of open connection sockets (TCP connections used to send data between client and server). We started looking at what code issues within our login system could be causing them to pile up without clearing. We spent weeks on this, making potential changes and hopeful fixes, with no success.

One of the hardest parts of this issue was that we could not recreate it locally, so we had to rely on the live server to debug it. Each time we tried a code change, we had to wait until the next time the server crashed, so there was a lot of time when we could not do anything but wait for another crash. Sometimes code changes we made seemed to work, but really didn't. The bug would not appear for a few days, or even a full week, and then suddenly happen again. So we were constantly being thrown between "Yay we fixed it!" and "Fuck it's back". The only clue we still had was that, no matter what we did, these connection socket leaks were still happening.

Phase 5: Player testimony

We went back to player testimonies, hearing what people were experiencing and getting footage of what was happening when it all went down.
We heard people tell us it probably had to do with Automated Tournaments, Quick Logins and various other features. So we went through rounds of disabling and re-enabling things until we could find what was wrong. Eventually, the bug for some reason seemed to change its modus operandi. It started manifesting as logout issues instead of login issues. Players who were logged in were not having their characters ever log out... Curious.

Phase 6: Discovery

Not all players were having logout issues when there were few players online. But once there were a ton of players online (around 100), logout issues became widespread. Because players were getting stuck on logout, I started investigating why they weren't being logged out. Since it mainly happened to a small number of individuals when the server was fresh and not very populated, I started with those users. I found that ping packets were still being sent for these players even when they told me that they had the Nin Online client closed. That's literally impossible, I thought. Without the Nin Online client open, what could possibly be pinging the server...?

ExitLag.

Phase 7: ExitLag

Around November last year, ExitLag started using a method of "optimization" that was essentially, at scale, a SYN flood attack. The culprit was third-party software that desperately wants to provide "better ping" for players, so it uses a combination of techniques to do so. One of those is using multiple relay servers to send the same packets to our server, spamming our server with the same unnecessary information multiple times. It sends dozens of SYN packets per second to our server through the port our game client uses to connect to the server. It does so through distributed servers across the world.

About SYN Flood Attacks
https://www.cloudflare.com/learning/ddos/syn-flood-ddos-attack/
https://en.wikipedia.org/wiki/SYN_flood

It doesn't even hide it.
The picture above shows that it has established multiple connections to our server and is constantly sending and receiving unnecessary data through them. What's scary is that the software (we've not fully investigated this claim) also seems to triple the amount of data our server sends for large packets of data like map data.

IP address/connection slot of someone using ExitLag and their source port number

The reason it was causing login issues was that it was filling up all the temporary slots allocated to TempPlayers (a method we use to verify real players and give them a slot in the game), because the server had no choice but to check all the empty packets that were being sent. The reason it was causing logout issues for players not using ExitLag was that it was overwhelming the disconnection system, blocking the disconnection queue and causing a thread leak that was slowing down the server.

A normal DDoS attack would have quickly triggered the DDoS protection we've had in place since 2013. But because this was done at the authentication level (it wasn't spamming packets, it was spamming SYN packets), it was creating a lot of new issues. Our DDoS protection was "per connection", whereas this was creating new connections constantly.

Another thing that really pissed me and Wolf off is that this isn't the first time ExitLag's methods have caused us issues. It was causing our server to throw errors in the past, and so we actually built workarounds. If only we had straight up banned it then.

Lastly, we unfortunately had an issue with timing out TempPlayers. The interval for our KeepAlive packets was set to 30000-60000 seconds instead of 30-60 seconds, which is a dumb mistake on our part. This made the logout issues much worse, and also aggravated the issue of us being DDoSed. We never found that mistake in the 5 years before this because we never had this issue.
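To illustrate why the KeepAlive interval mistake made the slot exhaustion so much worse, here is a minimal Python sketch. It is not Nin Online's server code: the `TempSlotTable` class, the capacity of 4, and the `simulate` helper are hypothetical, and the only figures taken from the post are the 30 vs 30000 second values.

```python
from dataclasses import dataclass, field

INTENDED_TIMEOUT_SECONDS = 30     # lower bound of the intended 30-60 s range
BUGGY_TIMEOUT_SECONDS = 30000     # lower bound of the mistaken 30000-60000 s range


@dataclass
class TempSlotTable:
    """Hypothetical table of temporary slots for unverified connections."""
    capacity: int
    timeout_seconds: float
    last_seen: dict = field(default_factory=dict)  # connection id -> last activity time

    def evict_stale(self, now: float) -> None:
        # Free every slot whose connection has been silent longer than the timeout.
        stale = [cid for cid, seen in self.last_seen.items()
                 if now - seen > self.timeout_seconds]
        for cid in stale:
            del self.last_seen[cid]

    def try_admit(self, conn_id: str, now: float) -> bool:
        # A new connection only gets a slot if one is free after eviction.
        self.evict_stale(now)
        if len(self.last_seen) >= self.capacity:
            return False
        self.last_seen[conn_id] = now
        return True


def simulate(timeout_seconds: float) -> bool:
    """Fill every slot with junk connections at t=0, then try a real login at t=120."""
    table = TempSlotTable(capacity=4, timeout_seconds=timeout_seconds)
    for i in range(4):
        table.try_admit(f"junk-{i}", now=0.0)
    return table.try_admit("real-player", now=120.0)
```

With the intended timeout the junk connections have been evicted by the time the real player arrives, so `simulate(INTENDED_TIMEOUT_SECONDS)` admits them. With the mistaken value the slots are still occupied two minutes later, and `simulate(BUGGY_TIMEOUT_SECONDS)` turns the real player away.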
Phase 8: Resolution

The following measures are now in place. They should prevent a future SYN flood attack and also ensure players aren't accidentally banned for using ExitLag.

- We've tweaked the SYN flood protection capabilities that Windows Server provides to suit Nin Online.
- With the help of ChatGPT, I wrote a new application that checks for SYN floods and quickly (within a minute) bans IP addresses that are flooding our server.
- We've banned ExitLag from being used with Nin Online, so players don't accidentally get their IPs banned.
- We fixed a bug that was causing KeepAlive packets to only be sent out every few hours, so even if a SYN flood attack happens, it will not cause widespread logout issues.
- We've contacted ExitLag to remove Nin Online from their listings.

This has been one of the hardest 5 months of developing Nin Online. I've been on full-time "make sure the server is not malfunctioning" duty for the entire time, and this ordeal has caused me severe mental distress.

All this to say, I don't like ExitLag.

Thank you to Wolf, Delp and all the players who have been providing information to help solve this issue.

Regards,
Ueda
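The flood-checking application mentioned in the post is not published, so the following is only a sketch of one common way to do that kind of check: count new connection attempts per source IP inside a sliding time window and ban any IP that goes over a threshold. The `SynFloodDetector` class, its method names, and the default thresholds are all hypothetical. Counting is keyed by IP address rather than by connection, which is the gap the post describes in the older "per connection" protection. Reading real connection attempts from the OS and applying bans at the firewall are left out.

```python
from collections import defaultdict, deque


class SynFloodDetector:
    """Sliding-window counter of connection attempts per source IP (illustrative only)."""

    def __init__(self, max_attempts: int = 100, window_seconds: float = 60.0):
        self.max_attempts = max_attempts      # attempts tolerated per window
        self.window_seconds = window_seconds  # length of the sliding window
        self.attempts = defaultdict(deque)    # ip -> timestamps of recent attempts
        self.banned = set()

    def record_attempt(self, ip: str, now: float) -> bool:
        """Record one connection attempt; return True if the IP is banned."""
        if ip in self.banned:
            return True
        recent = self.attempts[ip]
        recent.append(now)
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()
        if len(recent) > self.max_attempts:
            self.banned.add(ip)
            del self.attempts[ip]
            return True
        return False


# Example: one address hammering the port, another reconnecting occasionally.
detector = SynFloodDetector(max_attempts=5, window_seconds=60.0)
flood = [detector.record_attempt("203.0.113.7", now=float(t)) for t in range(6)]
slow = [detector.record_attempt("198.51.100.2", now=t * 30.0) for t in range(10)]
```

In this example the flooding address is banned on its sixth attempt inside the window, while the address that reconnects every 30 seconds never exceeds the limit. A window of 60 seconds matches the "within a minute" figure in the post; the attempt limit would have to be tuned against real traffic.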
    7 points
  2. March 2023 Promotions

Leaf Village
Jonin: @Tommy @Genshin @Eymon Arashi
Chunin: @Stampede @Viento @Genocidal @Steam bun @Don Tormenta
Specialized Jonin: @Emptyad

Sand Village
Jonin: @Sirch @Jugram
Chunin: @Ohta @Vaunt @Bakuton @Akurose

Mist Village
Chunin: @zackzack @Yuma
    5 points
  3. Suna Laws

1. Do not kill or attack another Sand Ninja for any reason. This includes directly causing them to die by manipulating surroundings (such as leading a bunch of mobs to them and then cloaking away). This does not include killing them in a mutually agreed upon spar.
2. You must aid a fellow Sand Ninja. Of course, there are situations where you can't help a fellow Sand Ninja, which is understandable, but you must do your best. Do not place a bounty on a Sand Ninja. Do not assist the enemy against Sand Ninja in any way.
3. Verbal harassment is not tolerated. If someone is breaking Nin Online rules, take it up with the GMs. It's not the player's responsibility to deal with this. (Do note, SMPF officers are able to jail you for general toxicity.) Two jailings result in a strike. Three strikes is an exile!
4. Respect and follow the commands of your ranked superiors. Here's a reminder of the chain of command:
TITLES: Kage -> ANBU Leader -> Council -> PB Leader -> Police Chief -> MedCorp Leader
RANKS: Kage -> Jonin -> Chunin -> Specialized Jonin -> Genin -> Academy Students
In general, title > rank.
5. You must respect the village and its members.
6. Above all, respect and follow the orders of the Kazekage.

These are general rules to follow; the severity of punishments is to be decided after meetings.
_______________________________________________________________________________________________________________
PARDONS
Pardon fee is 1k ryo up to 10k ryo. It is up to me whether I want to pardon you or not. Do not force me to do anything in DMs.

PEACE LIST
Sand has no peace list. Feel free to kill anyone who is not a sandie.
    2 points
  4. I saw that there are utility pouches and vests for Kirigakure (it doesn't seem like Konohagakure and Sunagakure have them yet), but I heard they didn't serve a purpose beyond fashion. Normally, items like these would increase inventory capacity for any item to occupy, but I don't believe they should have that function here, nor should they be purely cosmetic. Here are a few ideas I thought of for the utility vest and pouch:

Utility Pouch
- Provides two inventory slots that are separate from the 'Inventory' window, shown as a tab attached to it when the pouch is worn.
- These separate inventory slots can only hold ninja tools.
- The amount of each tool the pouch is capable of holding could be less than the amount the ninja's own inventory is capable of holding. (E.g., 100 or 50 ea.)
- The ninja tools in the pouch's inventory could also be counted with the ninja tools in the ninja's own inventory. (E.g., if a ninja has 150 shuriken in their inventory and 50 in their pouch, they cannot hold any more, as those two amounts combined come to 200.)

Utility Vest
- Provides three or four inventory slots that are separate from the 'Inventory' window, shown as a tab attached to it when the vest is worn.
- These separate inventory slots can only hold ninja tools.
- Could require the 'Chunin' or above rank to be worn.
- The amount of each tool the vest is capable of holding could be less than the amount the ninja's own inventory is capable of holding. (E.g., 100 or 50 ea.)
- The ninja tools in the vest's inventory could also be counted with the ninja tools in the ninja's own inventory. (E.g., if a ninja has 150 shuriken in their inventory and 50 in their vest, they cannot hold any more, as those two amounts combined come to 200.)
This would free up space within a ninja's own inventory by allowing ninja tools to be held within the utility pouches and vests, in the smaller amounts suggested above. But I feel like this should come with a few changes: a requirement, the drawbacks I already listed above, and consequences:

- The price of the utility pouch and vest should be increased to compensate for the convenience they will (could) provide.
- The utility pouch could continue to be available for purchase by any ninja of any rank, but the utility vest could have a 'Chunin' or above rank requirement, as they are typically the experienced ones who commonly wear this type of vest.
- Whenever a ninja dies with ninja tools within their utility pouch and/or vest, an enemy ninja can loot their ninja tools from them. The enemy ninja cannot loot more ninja tools than they're capable of holding. (E.g., if the enemy ninja has 147 kunai and the fallen ninja has 64 kunai within their utility pouch or vest, they will only be able to take 53 kunai, leaving 11 kunai behind.)

How these ninja tools drop is up to the developers' discretion, but here are two examples I have in mind:

- They could drop out of the utility pouch's and vest's inventories as any normal item would. Half of the amount of ninja tools carried could drop instead of all of them. (Developers' Choice.)
- They could remain in the utility pouch's and vest's inventories, but the enemy ninja will have to remember to check them (press their loot key on top of them) for any ninja tools. (This has the chance to prevent the fallen ninja's tools from being looted by quickly respawning before they can be checked, so there may need to be a longer delay before choosing a respawn option is possible.) Half of the amount of ninja tools carried could be taken instead of all of them. (Developers' Choice.)
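The carrying and looting limits in this suggestion come down to simple arithmetic, sketched below in Python. The cap of 200 per tool is inferred from the examples in the post, and the function names are made up for illustration; this is not how the game implements inventories.

```python
MAX_TOOLS_PER_TYPE = 200  # assumed cap, inferred from the 150 + 50 and 147 + 53 examples


def free_capacity(inventory_count: int, pouch_count: int,
                  cap: int = MAX_TOOLS_PER_TYPE) -> int:
    """How many more of a tool a ninja can pick up when pouch and inventory share one cap."""
    return max(0, cap - (inventory_count + pouch_count))


def loot_tools(looter_count: int, dropped_count: int,
               cap: int = MAX_TOOLS_PER_TYPE) -> tuple[int, int]:
    """Return (taken, left_behind) when a looter with looter_count tools loots dropped_count."""
    taken = min(dropped_count, max(0, cap - looter_count))
    return taken, dropped_count - taken


print(free_capacity(150, 50))  # 0: inventory and pouch together already hold 200
print(loot_tools(147, 64))     # (53, 11): matches the kunai example in the post
```

The "half of the tools" variants could be modelled by halving `dropped_count` before calling `loot_tools`; how to round an odd amount would be up to the developers.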
    1 point
  5. I, too, wish to register as a council candidate
    1 point
  6. This one's a simple suggestion—something that would help other ninja have a bit more awareness of their fellow ninjas. In addition to the one that shows us how long a fainted ninja has left before they die, there could be two additional status indicators for showing that a ninja has dozed off (AFK) and is talking (typing): Something like this icon could appear when a ninja has been idling for too long—no action nor clicking—to the point of nearing automatic disconnection (if that's a thing, I forgot). We could be able to manually trigger this by typing '/afk'. (This may be exploited to trick enemy ninja, however, that is a common tactic for launching surprise attacks on those that actually believe they're asleep.) Something like this icon could appear when a ninja is talking (typing in '/say'), but how it should appear is up to your—the developers'—discretion. My suggestion is to not let it be seen by other ninja unless those ninjas are an adequate number of tiles in range of the ninja that is in the middle of talking (typing). This would teach ninja to whisper rather than talk aloud (type in '/say') when hiding even though their words ('/say' message) would give them away anyway. This indicator should not appear when a ninja is talking (typing) in any other chat channel that isn't '/say'. These two would help us know when a ninja isn't consciously present (AFK) and when a ninja may be trying to talk to someone that's passing by, but that someone does not know if the ninja is prepared to say something to them or not.
    1 point
  7. Now that we can spawn in houses to lower our BIs, houses are used way more across the whole game. Some people even rent a house between two (Host and Co-Host) to split payment and expenses (on furniture, for example). Sounds very beautiful and cute. But currently, Co-Hosts do not have the same perks as the house owner, even though you "own part of it as well". So why not make Co-Hosts able to spawn in the house too, with reduced BI as well, and limit it to just 1 Co-Host per house? (Maybe it already is 1 per house; would appreciate it if someone can confirm.) Ty for reading btw
    1 point
  8. It's been a while since I made a forum post, but as the title says, let's have a Nin Easter Event to start the year right. I know most OG players like myself would like to see a new take on how events work and some spice added to the game. Here are my ideas of what the Easter event could be like:

1. Each village warzone has a unique rabbit boss. For example: the Sand rabbit boss is wind based, the Leaf rabbit boss is fire based, and the Mist rabbit boss is water based.
2. Each rabbit should give a minimum of 30k exp to make killing it worthwhile, and the exp should be shared with the party to give a good incentive to kill it. I know past event bosses have had the same old mechanics, so I would like to see a change where every player who is at least lvl 20 has a chance to kill the rabbit boss with their comrades.
3. Each boss can drop a unique Easter-themed weapon or piece of clothing. For example: Easter-themed auras, rabbit robe, rabbit paw shoes, chocolate-themed katana, egg crusher tonfas, rabbit ears, you know, something like that. They could be low-lvl weapons, nothing crazy.
4. Easter eggs can be spread around the danger zones and warzones, where players can break them to get coupons, blanks, pills, food items, whatever. This will give players an incentive to enter the danger zones to break these eggs, making for healthier, more active danger zones and creating fun.
5. This is just my personal opinion, but I think Nin should really utilize the community more and hold an art competition for items to be added to this Easter event. For example, the top 5 pixel artists who make the best Easter-themed weapon or cosmetic could get the silver role or a special title unique to them, and get paid with Nin credit or from a ryo prize pool to encourage them to take part in such events. I think including the players in the design of event themes creates a healthy environment for creativity and fun.
    1 point
  9. I try to log in but it doesn't let me get to the login screen. I don't know what to do. I am so sad, I wanna play the game. This is what it shows me when I click on the login button
    0 points
  10. I removed all the seals from the Fuu mission and she told me to go check out the cave but there is no cave loaded in on my map. It’s just a blank wall
    0 points
This leaderboard is set to Toronto/GMT-04:00